
arXiv:2605.27161v1. Abstract: This paper reports the core linguistic work performed to construct a dictionary-based morphological analyser for Malagasy simple verbs. The work uses the Unitex platform and comprised the construction of an electronic dictionary of Malagasy simple verbs. The data is encoded on the basis of morphological features. The morphological variations of verb stems and their combination with inflectional affixes are formalized in finite-state transducers represented by editable graphs. 78 transducers allow Unitex to generate a dictionary of allomorphs of stems.
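The generate-then-look-up idea behind a dictionary-based analyser can be illustrated with a minimal sketch. It does not use Unitex and does not reproduce the paper's transducers or data. It assumes only the commonly described m-/n-/h- alternation (present/past/future) on active verbs, with "mividy" (to buy) as the sole example. The names `TENSE_PREFIXES`, `inflect`, `build_dictionary` and `analyse` are hypothetical.

```python
# Toy illustration only: not Unitex, and not the paper's transducers or data.
from collections import defaultdict

# Assumed simplification: active verbs cited with initial "m-" (present)
# form the past with "n-" and the future with "h-".
TENSE_PREFIXES = {"present": "m", "past": "n", "future": "h"}


def inflect(lemma):
    """Generate tense forms by replacing the initial m- of the citation form."""
    if not lemma.startswith("m"):
        raise ValueError(f"expected a citation form starting with 'm': {lemma!r}")
    rest = lemma[1:]
    return {tense: prefix + rest for tense, prefix in TENSE_PREFIXES.items()}


def build_dictionary(lemmas):
    """Expand every lemma into its forms and index them by surface string."""
    index = defaultdict(list)
    for lemma in lemmas:
        for tense, form in inflect(lemma).items():
            index[form].append((lemma, tense))
    return dict(index)


def analyse(form, index):
    """Return all (lemma, tense) analyses for a surface form, or an empty list."""
    return index.get(form, [])


LEXICON = ["mividy"]  # hypothetical one-entry lexicon
INDEX = build_dictionary(LEXICON)

print(analyse("nividy", INDEX))
```

A real system would replace `inflect` with rules that also model stem allomorphy, which is the part the abstract says is handled by the 78 transducers.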
Continued progress in natural language processing (NLP) and computational linguistics makes detailed morphological analysis feasible for less-resourced languages, building on established platforms such as Unitex.
This work contributes to the foundational linguistic data required for developing robust AI applications for Malagasy, a step towards broader language inclusivity in AI.
A formal morphological analyser for Malagasy simple verbs could enable more accurate processing and understanding of the language in AI systems.
- Malagasy language speakers
- Computational linguists
- AI developers focused on low-resource languages
- Unitex platform developers
Improved machine translation and natural language understanding for Malagasy.
Potential for increased digital content creation and access in Malagasy.
Enhanced cultural preservation and educational resources for Malagasy communities globally.
Read at arXiv cs.CL