Researchers reach overall score of 32.87 on clinical QA task with two-stage QLoRA fine-tuning of Qwen3-4B
arXiv cs.CL · April 17, 2026
AI Summary
• QU-NLP team applies two-stage Quantised Low-Rank Adaptation (QLoRA) to Qwen3-4B loaded in 4-bit NF4 quantisation for the ArchEHR-QA 2026 shared task
• System first trained on 30,000 samples from emrQA-MedSQuAD corpus for clinical domain knowledge, then on 20 annotated development cases for task-specific output
• Subtask 3 (answer generation) achieves overall score of 32.87 with BLEU=9.42, ROUGE-L=27.04, SARI=55.42, BERTScore=43.00, and MEDCON=37.04
• Subtask 4 (evidence alignment) uses weighted ensemble of BM25, TF-IDF, and fine-tuned cross-encoder to reach 67.16 micro-F1 score on 100-case test set
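
The Subtask 4 bullet describes a weighted ensemble of three scorers, evaluated with micro-F1. The sketch below shows one plausible way such an ensemble and metric could be wired together. It is not the QU-NLP team's implementation: the summary does not give the weights, the normalisation scheme, or the selection threshold, so the min-max scaling, the weight values, the threshold, and the function names (`min_max`, `ensemble_select`, `micro_f1`) are all illustrative assumptions. The per-sentence scores are toy numbers standing in for real BM25, TF-IDF, and cross-encoder outputs.

```python
from typing import Dict, List, Set


def min_max(scores: List[float]) -> List[float]:
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ensemble_select(
    scores_by_ranker: Dict[str, List[float]],
    weights: Dict[str, float],
    threshold: float,
) -> Set[int]:
    """Return indices of sentences whose weighted, normalised score
    reaches the threshold (assumed selection rule, not from the source)."""
    n = len(next(iter(scores_by_ranker.values())))
    total_weight = sum(weights[name] for name in scores_by_ranker)
    combined = [0.0] * n
    for name, scores in scores_by_ranker.items():
        for i, s in enumerate(min_max(scores)):
            combined[i] += weights[name] * s / total_weight
    return {i for i, c in enumerate(combined) if c >= threshold}


def micro_f1(predicted: List[Set[int]], gold: List[Set[int]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives
    over all cases before computing precision and recall."""
    tp = sum(len(p & g) for p, g in zip(predicted, gold))
    fp = sum(len(p - g) for p, g in zip(predicted, gold))
    fn = sum(len(g - p) for p, g in zip(predicted, gold))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy scores for four candidate note sentences (hypothetical values).
toy_scores = {
    "bm25": [2.0, 0.0, 1.0, 4.0],
    "tfidf": [0.5, 0.0, 0.25, 1.0],
    "cross_encoder": [4.0, -4.0, -4.0, 0.0],
}
# Hypothetical weights; the summary does not report the real ones.
toy_weights = {"bm25": 0.25, "tfidf": 0.25, "cross_encoder": 0.5}

selected = ensemble_select(toy_scores, toy_weights, threshold=0.5)
score = micro_f1([selected, {1}], [{0, 2}, {1}])
```

With these toy inputs, sentences 0 and 3 are selected. Micro-F1 pools counts over both cases before computing precision and recall, so a case with many evidence sentences contributes more than a case with few.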