Layer-wise Probing of wav2vec 2.0 and Whisper for Consonant Cluster Reduction in African American English

arXiv:2606.23948v1. Abstract: Self-supervised and supervised speech models are increasingly used to investigate which linguistic information their internal representations encode, and at what level of abstraction they encode it. One underexplored phenomenon is consonant cluster reduction (CCR) in African American English (AAE), a widespread phonological process and a source of automatic speech recognition (ASR) disparity. To examine how CCR is represented, we conduct speaker-independent layer-wise probing of wav2vec2-base and Whisper-small using two tasks: segmental reduction
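As a rough illustration of what speaker-independent layer-wise probing involves, the sketch below holds out whole speakers, then fits one small probe per layer and reports its held-out accuracy. It is a minimal sketch under stated assumptions, not the paper's method: the features are synthetic stand-ins for pooled hidden states from wav2vec2-base or Whisper-small, the binary label (reduced vs. unreduced) is hypothetical, and a nearest-centroid classifier is swapped in for whatever probe the authors actually used. All function names are invented for this example.

```python
import random
from statistics import mean


def speaker_independent_split(speakers, test_fraction=0.25, seed=0):
    """Hold out whole speakers so no speaker appears in both train and test."""
    unique = sorted(set(speakers))
    random.Random(seed).shuffle(unique)
    n_test = max(1, round(len(unique) * test_fraction))
    held_out = set(unique[:n_test])
    train_idx = [i for i, s in enumerate(speakers) if s not in held_out]
    test_idx = [i for i, s in enumerate(speakers) if s in held_out]
    return train_idx, test_idx


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def probe_accuracy(features, labels, train_idx, test_idx):
    """Nearest-centroid probe: class means from train, accuracy on test."""
    centroids = {}
    for label in {labels[i] for i in train_idx}:
        rows = [features[i] for i in train_idx if labels[i] == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    correct = 0
    for i in test_idx:
        pred = min(centroids, key=lambda c: sq_dist(features[i], centroids[c]))
        correct += pred == labels[i]
    return correct / len(test_idx)


def layerwise_probe(layer_features, labels, speakers):
    """Fit one probe per layer on the same speaker-independent split."""
    train_idx, test_idx = speaker_independent_split(speakers)
    return [probe_accuracy(f, labels, train_idx, test_idx) for f in layer_features]


def make_toy_data(n_speakers=8, per_speaker=10, seed=0):
    """Synthetic stand-in data: layer 0 is pure noise, layer 1 encodes the label."""
    rng = random.Random(seed)
    labels, speakers = [], []
    for s in range(n_speakers):
        for k in range(per_speaker):
            speakers.append(f"spk{s}")
            labels.append(k % 2)  # hypothetical label: 1 = reduced, 0 = unreduced
    noise_layer = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in labels]
    signal_layer = [[10.0 * y + rng.uniform(-1, 1), rng.uniform(-1, 1)] for y in labels]
    return [noise_layer, signal_layer], labels, speakers


if __name__ == "__main__":
    layers, labels, speakers = make_toy_data()
    for depth, acc in enumerate(layerwise_probe(layers, labels, speakers)):
        print(f"layer {depth}: held-out accuracy {acc:.2f}")
```

Splitting by speaker rather than by utterance matters because a probe can otherwise score well by recognizing voices it saw in training instead of the phonological feature of interest. With real models, each entry of `layer_features` would be replaced by one layer's hidden states pooled over the relevant audio span.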
As ASR systems are built into more critical applications, the spread of speech models calls for closer analysis of their linguistic biases and representational capabilities, particularly for underserved dialects.
This research examines how foundational speech models represent a feature of African American English that contributes to ASR disparity, underscoring the need for more inclusive training data and model design to prevent technological disparities.
Understanding these biases gives a clearer path toward more equitable and accurate speech recognition for diverse linguistic communities, and may influence future model development and deployment standards.
- Linguistic research institutions
- Developers of inclusive AI models
- African American English speakers
- Companies with biased ASR systems
- Monolingual AI development approaches
Improved fairness and accuracy in ASR systems for African American English.
Increased demand for linguistically diverse datasets and researchers in AI development.
Possible regulation or industry standards requiring cultural and linguistic equity audits for AI models, especially in public-facing applications.
Read at arXiv cs.CL