TeamHerald@CHIPSAL 2026: Hate Speech Detection and Sentiment Analysis of Nepali Memes using Transformer-based Architectures and Ensemble Learning

arXiv:2606.08770v1. Abstract: The analysis of internet memes in the Nepali language is complicated by frequent code-mixing and a lack of established baseline resources. While memes inherently combine visual and textual elements, this study focuses on a text-centric approach by extracting embedded text using an OCR layer and modeling it with Transformer-based architectures. We evaluate six distinct models and investigate the comparative effectiveness of Hard and Soft Voting ensemble strategies across two tasks: binary hate speech detection and three-class sentiment analysis.
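To make the difference between the two ensemble strategies concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names `hard_vote` and `soft_vote`, the tie-breaking rule, and the probability values are all assumptions made for illustration, and it uses three hypothetical models rather than the six evaluated in the study.

```python
from collections import Counter


def soft_vote(prob_rows):
    """Average per-class probabilities across models, then pick the top class."""
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    mean = [sum(row[c] for row in prob_rows) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])


def hard_vote(prob_rows):
    """Each model votes for its own top class; the most common label wins."""
    votes = [max(range(len(row)), key=lambda c: row[c]) for row in prob_rows]
    counts = Counter(votes)
    best = max(counts.values())
    # Ties go to the smallest class index (an arbitrary choice for this sketch).
    return min(c for c, n in counts.items() if n == best)


# Hypothetical class probabilities from three models for one meme's OCR text,
# over three sentiment classes indexed 0, 1, 2. The numbers are made up.
example = [
    [0.40, 0.35, 0.25],  # model A leans weakly toward class 0
    [0.40, 0.35, 0.25],  # model B leans weakly toward class 0
    [0.05, 0.90, 0.05],  # model C is strongly confident in class 1
]

hard_label = hard_vote(example)  # two of three models vote for class 0
soft_label = soft_vote(example)  # averaged probabilities favor class 1
print(hard_label, soft_label)
```

In this made-up case the two strategies disagree: hard voting counts two weak votes for class 0, while soft voting lets the one confident model outweigh them. That sensitivity to model confidence is the kind of behavior a comparison of the two strategies would probe.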
Published ahead of CHIPSAL 2026, the study reflects ongoing academic efforts to address language-specific challenges in AI, particularly in hate speech detection and sentiment analysis.
This research contributes to AI models that can handle linguistic diversity and code-mixing, both of which are crucial for broader AI deployment and ethical content moderation in non-dominant languages.
The explicit focus on Nepali memes and code-mixing provides specific methodologies for under-resourced languages, potentially leading to more accurate and culturally nuanced AI tools.
- AI ethicists
- NLP researchers
- Platforms with diverse user bases
- Nepali-speaking internet users
- Hate speech creators in Nepali
- Platforms with unsophisticated content moderation
Improved detection of harmful content in Nepali online spaces.
Potential for replication of these methods in other low-resource languages facing similar content moderation challenges.
Reduced spread of misinformation or targeted harassment, fostering safer online communities for diverse linguistic groups.