A Survey of Text and Speech Resources for Hausa and Fongbe: Availability, Quality, and Gaps for NLP Development

arXiv:2605.22828v1. Abstract: This survey provides a comprehensive catalog of publicly available text and speech resources for two West African languages: Hausa, an Afroasiatic language with approximately 80-100 million speakers, and Fongbe, a Niger-Congo language spoken by approximately 2 million people in Benin. These languages represent contrasting cases on the resource availability spectrum. We address the question: "What is the current state of publicly available NLP resources for Hausa and Fongbe, and what gaps remain?" Through systematic search of academic repos
The proliferation of AI models is driving a global effort to expand language resource availability, particularly for under-resourced languages, to ensure broader inclusivity and development.
This survey highlights an ongoing push to expand AI's linguistic reach beyond dominant languages, which is critical for global AI development and national digital sovereignty.
The explicit cataloging of available and missing resources for Hausa and Fongbe provides a clearer roadmap for AI development in these specific language domains, exposing concrete gaps that need addressing.
- West African AI developers
- Hausa and Fongbe speaking populations
- Linguistic data collection initiatives
- Companies relying solely on large English/dominant language datasets
Increased investment and focused effort are likely to be directed toward creating NLP resources for Hausa and Fongbe.
Improved AI models for these languages could unlock new economic and social opportunities in the regions where they are spoken.
The success in these languages could spur similar initiatives for other under-resourced languages globally, leading to a more linguistically diverse AI landscape.
Read at arXiv cs.CL