CoughSense: Five-Class Respiratory Disease Classification via Whisper Encoder Fine-Tuning and Dual-Encoder Cross-Attention Fusion with Balanced Contrastive Learning

arXiv:2606.02998v1 Announce Type: new Abstract: Automated cough analysis offers a path to low-cost respiratory screening, but most existing work stops at binary COVID-19 detection. A practical tool needs to tell apart several respiratory conditions from one cough recording on a consumer smartphone. We present CoughSense, a system that sorts cough recordings into five classes. These are healthy, COVID-19, asthma or respiratory condition, bronchitis, and pneumonia. We aggregated 18,301 recordings from four public datasets (Coswara, CoughVID, Virufy, and the West China Hospital Pediatric Cough Da
The proliferation of consumer smartphones and advances in AI audio processing, particularly pretrained speech models such as Whisper, enable new applications for health screening at scale.
This technology offers a low-cost, accessible method for early detection and differentiation of multiple respiratory diseases, potentially reducing healthcare burdens and improving public health outcomes globally.
The ability to classify five distinct respiratory conditions from a single cough recording on a smartphone could shift initial screening from specialized clinics to ubiquitous personal devices.
- Digital health platforms
- Smartphone manufacturers
- Low-income regions
- AI developers in audio processing
- Traditional diagnostic labs (for initial screening)
- Healthcare providers reliant on manual differential diagnosis
Widespread adoption could lead to earlier disease detection and more efficient allocation of medical resources.
The data collected could further train and refine AI models, creating a virtuous cycle of improved diagnostic accuracy and new public health insights.
This could set a precedent for other AI-powered, non-invasive home diagnostics, decentralizing elements of primary care.
Read at arXiv cs.LG