
arXiv:2604.17134v2 (announce type: replace)
Abstract: We present RoIt-XMASA, a multilingual dataset that extends the Cross-lingual Multi-domain Amazon Sentiment Analysis to Italian and Romanian, comprising 36,000 labeled reviews across three domains (books, movies, and music) and 202,141 unlabeled samples. To address cross-lingual and cross-domain challenges, we propose a multi-target adversarial training framework that employs loss reversal with meta-learned coefficients to dynamically balance sentiment discrimination with domain and language invariance. XLM-R achieves an F1-score of 66.23% wit
Making AI models more robust and inclusive requires specialized multilingual datasets that cover the languages and domains English-centric resources leave out.
This development indicates progress in building AI models that can operate effectively across diverse languages and domains, reducing reliance on English-centric data and potentially enabling broader AI adoption.
RoIt-XMASA and the proposed multi-target adversarial training framework offer new tools for improving sentiment analysis in less-resourced languages such as Romanian and Italian, supporting more inclusive AI applications.
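To make the "loss reversal" idea in the abstract concrete, here is a minimal sketch of one plausible reading, in the spirit of domain-adversarial training: the encoder minimizes the sentiment loss while the domain and language classification losses enter with a reversed sign, so features that help identify the domain or language are penalized. The function names, the subtraction form, and the fixed coefficients `lam_dom` and `lam_lang` are illustrative assumptions. The abstract says the coefficients are meta-learned but does not describe how, so they are plain arguments here.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under predicted probabilities."""
    return -math.log(probs[label])


def encoder_objective(sent_loss, dom_loss, lang_loss, lam_dom, lam_lang):
    """Hypothetical combined objective with reversed adversarial losses.

    Assumption (not from the source): the encoder minimizes the sentiment
    loss and maximizes the domain and language losses, weighted by
    coefficients that the paper would meta-learn and that are fixed here.
    """
    return sent_loss - lam_dom * dom_loss - lam_lang * lang_loss


if __name__ == "__main__":
    # Toy predictions for one review: sentiment (2 classes),
    # domain (books, movies, music), language (Italian, Romanian).
    sent = cross_entropy([0.2, 0.8], label=1)
    dom = cross_entropy([1 / 3, 1 / 3, 1 / 3], label=0)
    lang = cross_entropy([0.5, 0.5], label=1)
    print(encoder_objective(sent, dom, lang, lam_dom=0.5, lam_lang=0.5))
```

Under this reading, a domain or language classifier that is more confused (higher loss) lowers the encoder objective, which is what pushes the shared representation toward domain and language invariance.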
- AI researchers in natural language processing
- Companies operating in Central and Southern European markets
- Developers of multilingual AI applications
- Users of AI in Romanian and Italian speaking regions
- Monolingual AI solutions without expansion capabilities
- Companies relying solely on English-centric sentiment analysis
Improved sentiment analysis accuracy for Romanian and Italian in commercial and research applications.
Accelerated development of more sophisticated, culturally nuanced AI models for these languages, leading to better customer service and content moderation.
Reduced digital language barriers and increased economic opportunities for businesses and individuals in these linguistic markets, potentially fostering sovereign AI capabilities at a regional level.
Read at arXiv cs.CL