Closing the Gap at CRAC 2026: Two-Stage Adaptation for LLM-Based Multilingual Coreference Resolution

arXiv:2605.16984v2. Abstract: We present our submission to the LLM track of the 2026 Computational Models of Reference, Anaphora and Coreference (CRAC 2026) shared task. With an average CoNLL F1 score of 74.32 on the official test set, our system ranked first in the LLM track and third overall. Our system is based on the Gemma-3-27b model, fine-tuned using a two-stage strategy with a multilingual base adapter followed by dataset-specific adapters. We represent mention spans by their headword using an XML-inspired format with local reindexing and annotate documents iterat
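The abstract mentions representing each mention by its headword in an XML-inspired format with local reindexing, but the excerpt does not show the format itself. The following is only a minimal sketch of one way such a representation could look: the `<m id="...">` tag, the function name `tag_headwords`, and the rule that entity IDs are renumbered from 1 in order of first appearance are all assumptions made for illustration, not the authors' actual scheme.

```python
from typing import Dict, List, Tuple
from xml.sax.saxutils import escape


def tag_headwords(tokens: List[str], mentions: List[Tuple[int, int]]) -> str:
    """Wrap mention headwords in XML-like tags with locally renumbered IDs.

    tokens:   the tokens of one text chunk.
    mentions: (head_token_index, global_entity_id) pairs (hypothetical input shape).
    """
    local_ids: Dict[int, int] = {}      # global entity id -> local id (1, 2, ...)
    head_to_local: Dict[int, int] = {}  # head token index -> local id

    # Assumed reindexing rule: number entities by first appearance in the chunk.
    for head, entity in sorted(mentions):
        if entity not in local_ids:
            local_ids[entity] = len(local_ids) + 1
        head_to_local[head] = local_ids[entity]

    out = []
    for i, tok in enumerate(tokens):
        text = escape(tok)  # keep the output well-formed if a token contains < or &
        if i in head_to_local:
            out.append(f'<m id="{head_to_local[i]}">{text}</m>')
        else:
            out.append(text)
    return " ".join(out)


tokens = ["Anna", "said", "she", "liked", "the", "old", "house"]
mentions = [(0, 412), (2, 412), (6, 97)]  # made-up global entity ids
print(tag_headwords(tokens, mentions))
```

In this sketch, the two mentions of entity 412 share local ID 1 and the head of "the old house" gets local ID 2, so the model would only ever see small, chunk-relative identifiers rather than document-wide ones.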
The CRAC 2026 shared task results show that a fine-tuned LLM can be competitive on multilingual coreference resolution: the system ranked first in the LLM track and third overall.
Coreference resolution determines which expressions in a text refer to the same entity, so better performance on it supports more accurate tracking of context and relationships in applications built on LLMs.
The result suggests that LLMs can be adapted to interpret nuanced textual references, which could benefit AI agents and information extraction.
- AI developers
- NLP researchers
- Google (Gemma model)
- AI application providers
- Legacy NLP systems
- Manual data annotation processes
LLMs will demonstrate enhanced capabilities in understanding complex documents and conversations.
This improved understanding will accelerate the development and deployment of more effective AI agents and automated content summarization tools.
The widespread adoption of highly capable coreference-aware AI systems could redefine knowledge management and information retrieval across industries.