Cross-lingual Relation Extraction with Large Language Models: Zero-Shot, Few-Shot, and Fine-Tuned Evaluation on Romanian

arXiv:2606.31718v1. Abstract: Relation extraction (RE) for low-resource languages is typically constrained by the lack of annotated corpora. We investigate the feasibility of cross-lingual RE for Romanian by combining automatic dataset translation with large language model (LLM) inference. We translate the SemEval-2010 Task 8 benchmark from English to Romanian using an LLM-based translation pipeline and evaluate Gemma 4 31B under zero-shot, few-shot, and QLoRA fine-tuned configurations, against four encoder baselines spanning 125M to 560M parameters: XLM-RoBERTa (base and la
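The zero-shot and few-shot configurations named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard SemEval-2010 Task 8 label set (nine directed relations plus Other) and the benchmark's <e1>/<e2> entity markers. The prompt wording, the helper names build_prompt and parse_label, and the Romanian sentence are hypothetical illustrations, not the paper's actual prompts or data.

```python
import re

# SemEval-2010 Task 8 relation types: nine directed relations plus "Other".
RELATIONS = [
    "Cause-Effect", "Instrument-Agency", "Product-Producer",
    "Content-Container", "Entity-Origin", "Entity-Destination",
    "Component-Whole", "Member-Collection", "Message-Topic",
]
# Each relation appears in both directions, giving 18 directed labels + "Other".
LABELS = [
    f"{rel}({a},{b})"
    for rel in RELATIONS
    for a, b in (("e1", "e2"), ("e2", "e1"))
] + ["Other"]


def build_prompt(sentence, examples=()):
    """Build a zero-shot prompt (no examples) or a few-shot prompt.

    The wording is an assumption made for illustration only.
    """
    lines = [
        "Classify the relation between the entities marked <e1> and <e2>.",
        "Answer with exactly one label from this list:",
        ", ".join(LABELS),
        "",
    ]
    for ex_sentence, ex_label in examples:
        lines += [f"Sentence: {ex_sentence}", f"Label: {ex_label}", ""]
    lines += [f"Sentence: {sentence}", "Label:"]
    return "\n".join(lines)


def parse_label(completion):
    """Map a raw model completion to a known label, defaulting to 'Other'."""
    normalized = re.sub(r"\s+", "", completion).lower()
    for label in LABELS:
        if label.lower() in normalized:
            return label
    return "Other"


# Hypothetical Romanian sentence ("The fire was caused by a short circuit.").
demo = "<e1>Incendiul</e1> a fost provocat de un <e2>scurtcircuit</e2>."
zero_shot_prompt = build_prompt(demo)
few_shot_prompt = build_prompt(demo, examples=[(demo, "Cause-Effect(e2,e1)")])
```

The prompt string would be sent to whichever model is under evaluation, and parse_label would normalize its reply before scoring. Falling back to Other for unparseable replies is one possible design choice; it keeps scoring simple but can hide formatting failures, so counting fallbacks separately may be worthwhile.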
The proliferation of advanced LLMs, together with growing interest in extending their capabilities to low-resource languages, makes this research timely.
For strategic readers, the work matters because it demonstrates a viable pathway for AI development in languages beyond the few most widely spoken ones, which could decentralize AI advancement and lower language barriers in technology.
If LLMs can perform cross-lingual relation extraction for low-resource languages, advanced NLP capabilities become more accessible in linguistic contexts where annotated corpora are scarce.
- AI developers in low-resource language communities
- Multilingual NLP platforms
- Governments investing in domestic AI capabilities
- Monolingual AI research paradigms
Increased availability of sophisticated NLP tools for languages with limited digital resources.
Accelerated development of AI applications and services tailored to specific linguistic and cultural markets.
Enhanced linguistic diversity in the global AI ecosystem, potentially challenging the dominance of major languages in AI development.
Read at arXiv cs.CL