Multilingual and Cross-Lingual Citation Needed Detection on Wikipedia for Lower-Resource Languages

arXiv:2605.31136v1. Abstract: In automated fact-checking (AFC), check-worthiness detection identifies claims requiring verification based on domain-specific criteria. On Wikipedia, this task instantiates as Citation Needed Detection (CND), which flags claims lacking supporting citations. However, existing research has largely overlooked lower-resource languages, and recent AFC pipelines rely on large language models (LLMs), which are inaccessible to low-resource organizations. We introduce MCN, a multilingual CND corpus spanning 18 languages across three resource levels, on w
The proliferation of AI-generated content and the increasing reliance on LLMs necessitate better automated fact-checking, especially in diverse linguistic contexts.
This work directly addresses a critical gap in AI accessibility and reliability for global information systems by enabling fact-checking in lower-resource languages.
Multilingual citation-needed detection extends automated content verification beyond English-centric models, making it fairer and more usable for organizations that cannot rely on LLM-based pipelines.
- Lower-resource language communities
- Organizations focused on information integrity
- Developers of multilingual AI systems
- Wikipedia and similar collaborative platforms
- Propagators of misinformation in lower-resource languages
- Monolingual AI development approaches
Improved truthfulness and quality of information on platforms like Wikipedia across a broader linguistic spectrum.
Reduced incidence of unchallenged false claims in languages traditionally underserved by existing AI tools.
Enhanced trust in digital information and potentially a more equitable global participation in knowledge creation and consumption.
Read at arXiv cs.CL