FADA: Accessible fetal ultrasound interpretation and annotation with a selectively distilled unified vision-language model

arXiv:2606.11106v1 Announce Type: cross

Abstract: A global shortage of trained sonographers limits prenatal ultrasound screening in low- and middle-income countries, where over half of pregnant women receive no skilled sonography. Current deep learning approaches address detection, segmentation, or classification in isolation, each demanding a separate model and expert-specified labels at inference. We present FADA, a unified vision-language model built on Qwen3.5-VL that performs clinical interpretation, classification, detection, and segmentation through a single interpretation-first pipeline...
Advanced vision-language models are now widely available, and the need for prenatal screening in underserved regions is urgent, which makes this application timely.
This development addresses a critical healthcare gap in low- and middle-income countries and showcases the practical utility of unified AI models for complex medical tasks.
Fetal ultrasound interpretation could become more accessible and standardized globally, which may help reduce maternal and infant mortality linked to sonographer shortages.
- Patients in low- and middle-income countries
- Global health organizations
- AI healthcare solution providers
- Developers of unified vision-language models
- Traditional medical imaging education institutions (if they do not adapt)
- Consulting firms reliant on sonographer shortages
Early detection of fetal complications could increase in underserved regions.
Perinatal morbidity and mortality could fall significantly as a result, leading to healthier populations.
This success could accelerate the adoption of similar unified AI models across other medical and diagnostic fields, redefining healthcare delivery models globally.
Read at arXiv cs.AI