
arXiv:2606.06586v1 Announce Type: new
Abstract: Large language models (LLMs) trained predominantly on English data encode substantial world knowledge, yet often fail to express it reliably in other languages, a phenomenon known as cross-lingual factual inconsistency. To study and address this, we introduce PolyFact, a large-scale parallel multilingual factual QA dataset containing 100K Wikidata-grounded facts across 12 typologically diverse languages. Using PolyFact, we compare light continual pretraining (CPT), supervised fine-tuning (SFT), and reinforcement learning via Group Relative Policy
As LLMs are deployed worldwide, models trained mostly on English data need to perform reliably in other languages.
Better cross-lingual factual recall makes LLMs more useful and trustworthy outside English, which supports broader adoption and reduces English-centric bias.
If the training methods compared in the paper improve cross-lingual consistency, LLMs could be deployed more reliably in non-English contexts, giving more accurate answers with fewer factual inconsistencies across languages.
- Non-English speaking markets
- Multilingual AI developers
- Global information services
- Emerging market economies
- English-centric AI models
- Monolingual data providers
Increased reliability and adoption of AI in diverse linguistic communities.
Accelerated development of AI applications tailored for specific non-English markets, potentially fostering new economic growth sectors.
Reduced information asymmetry globally as AI becomes a more equitable tool for knowledge access and creation, potentially shifting geopolitical influence.
Read at arXiv cs.CL