Low Resource Multimodal Translation of Nepali Spoken Words into Emotion-Conditioned Sign Language Avatars

arXiv:2606.26107v1 Announce Type: cross Abstract: Sign language communication systems that integrate emotional expression remain underexplored, particularly for low-resource languages. This pilot study presents NEST-V1 (Nepali Emotion and Speech Transformer - Version 1), a proof-of-concept multimodal framework that demonstrates the feasibility of generating emotion-conditioned Nepali Sign Language avatars from spoken input. As a preliminary investigation, we focus on four common Nepali words ("thank you", "hello", "house", "me") across three emotional states (happy, neutral, sad) to validate
The proliferation of AI and advanced computational linguistics enables the development of complex multimodal translation systems for historically underserved languages.
This demonstration highlights the potential for AI to bridge communication gaps for low-resource languages and communities, expanding accessibility and inclusivity.
The feasibility of generating emotion-conditioned sign language avatars from spoken input has been demonstrated for one low-resource language, Nepali, albeit for a vocabulary of only four words across three emotional states.
- Sign language communities
- NLP researchers
- Humanitarian organizations
- AI avatar developers
- Traditional translation services (long-term, specialized niches)
NEST-V1 provides a proof of concept for similar systems in other low-resource languages, potentially accelerating their development.
This could lead to improved assistive technologies and educational tools for the deaf and hard-of-hearing community globally.
The development of highly expressive and nuanced AI-driven communication tools could subtly reshape human-computer interaction paradigms, with implications for metaverse and virtual reality applications.