New self-supervised method enriches medical imaging reports by adding omitted positive findings, boosting vision-language model performance by up to 7.47%

arXiv cs.LG · Apr 14, 2026 · 1 min read

5 Key Points

SemEnrich addresses a reporting bias in medical datasets: clinicians predominantly document abnormalities and omit positive or neutral findings
The method uses semantic clustering of report sentences to automatically enrich training data with relevant observations from different clusters
Testing showed significant gains: 5.63% on COMET, 7.47% on RadGraph-F1, 7.40% on Sentence BLEU, and 5.30% on CheXbert-F1
Ablation studies attributed the improvements to semantic clustering rather than to random data augmentation
Researchers also developed a way to incorporate semantic cluster information into reward design for GRPO training
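
The summary above does not give the paper's actual algorithm, so the following is only a minimal sketch of the general idea of cluster-based enrichment and a cluster-aware reward. Everything in it is an assumption made for illustration: the Jaccard word-overlap similarity, the greedy one-pass clustering, the choice of each cluster's most frequent sentence as the sentence to add, and the names `build_clusters`, `enrich`, and `cluster_recall`. None of these are taken from SemEnrich itself.

```python
from collections import Counter, defaultdict

THRESHOLD = 0.3  # assumed similarity cutoff, chosen for this toy example


def tokens(sentence):
    """Lowercased set of words with surrounding punctuation stripped."""
    words = (w.strip(".,;:").lower() for w in sentence.split())
    return frozenset(w for w in words if w)


def jaccard(a, b):
    """Word-overlap similarity (a stand-in for a real sentence embedding)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign(sentence, seeds):
    """Index of the first cluster seed similar enough to `sentence`, else None."""
    t = tokens(sentence)
    for idx, seed in enumerate(seeds):
        if jaccard(t, seed) >= THRESHOLD:
            return idx
    return None


def build_clusters(corpus):
    """Greedy one-pass clustering over all report sentences.

    Returns the cluster seeds and, per cluster, its most frequent sentence,
    which this sketch uses as the sentence to add during enrichment.
    """
    seeds, counts = [], defaultdict(Counter)
    for report in corpus:
        for sentence in report:
            idx = assign(sentence, seeds)
            if idx is None:
                seeds.append(tokens(sentence))
                idx = len(seeds) - 1
            counts[idx][sentence] += 1
    fillers = {idx: c.most_common(1)[0][0] for idx, c in counts.items()}
    return seeds, fillers


def enrich(report, seeds, fillers):
    """Append one sentence for every cluster the report does not mention."""
    covered = {assign(s, seeds) for s in report}
    return report + [fillers[i] for i in sorted(fillers) if i not in covered]


def cluster_recall(generated, reference, seeds):
    """Hypothetical reward term: share of reference clusters the output covers."""
    ref = {assign(s, seeds) for s in reference} - {None}
    gen = {assign(s, seeds) for s in generated}
    return len(ref & gen) / len(ref) if ref else 0.0


# Toy corpus: each report is a list of sentences.
corpus = [
    ["The heart size is normal.", "No pleural effusion is seen."],
    ["The heart size is normal.", "Small left pleural effusion is seen."],
    ["No pleural effusion is seen."],
]
seeds, fillers = build_clusters(corpus)
enriched = enrich(corpus[2], seeds, fillers)
```

In this toy corpus the third report mentions only the effusion cluster, so `enrich` appends the heart-size sentence, while the second report already covers both clusters and is returned unchanged. A score such as `cluster_recall` could in principle be one term of a GRPO reward, but how the authors actually combine cluster information with the reward is not stated in this summary.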
