
arXiv:2511.10404v2 (announce type: replace)
Abstract: In spite of the remarkable advancements in the field of Natural Language Processing, the task of Entity Linking (EL) remains challenging in the field of humanities due to complex document typologies, lack of domain-specific datasets and models, and long-tail entities, i.e., entities under-represented in Knowledge Bases (KBs). The goal of this paper is to address these issues with two main contributions. The first contribution is DELICATE, a novel neuro-symbolic method for EL on historical Italian which combines a BERT-based encoder with conte
Advances in Natural Language Processing (NLP) are extending what automated linguistic analysis can do with historical and culturally specific data.
This matters to cultural institutions, researchers, and AI developers who analyze and preserve historical texts, because current entity linking tools struggle with complex document typologies and long-tail entities.
More accurate entity linking on complex historical documents, such as those in historical Italian, opens new avenues for digital humanities and for AI applications in less-resourced domains.
- Digital Humanities Researchers
- Libraries and Archives
- NLP Developers
- Cultural Preservation Organizations
- Traditional manual annotation methods
Improved accessibility and searchability of historical texts through enhanced entity recognition.
New insights from analyses of historical data that were previously inaccessible or too resource-intensive to carry out.
Enhanced AI models trained on richer, more accurate historical datasets, leading to broader applications in social sciences.
Read at arXiv cs.CL