Layer-Resolved Optimal Transport for Hallucination Detection in NMT and Abstractive Summarization

arXiv:2606.13216v1. Abstract: Optimal transport (OT) has been shown to detect hallucinations in neural machine translation (NMT) by measuring the geometric distance between cross-attention distributions and a reference distribution, without any supervision. We extend this analysis to all six decoder layers of the Fairseq DE-EN model ($N=3{,}414$), showing that Wass-to-Unif and Wass-to-Data are complementary detectors specialised across hallucination types, that detection is concentrated in layers L1--L4 with L5 anti-predictive for subtler types, and that hallucinated translat
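To make the Wass-to-Unif idea concrete, the following is a minimal sketch of a per-layer score, not the paper's implementation. It assumes a 0/1 transport cost, under which the optimal transport distance between two distributions equals their total variation distance; the abstract does not state which cost is used. The function names (`attention_mass`, `wass_to_unif`, `layerwise_scores`) and the toy attention matrices are hypothetical and are not data from the paper.

```python
def attention_mass(cross_attention):
    """Average a (target_len x source_len) attention matrix over target steps.

    Returns one value per source token: the share of attention it received.
    """
    n_rows = len(cross_attention)
    n_cols = len(cross_attention[0])
    return [sum(row[j] for row in cross_attention) / n_rows for j in range(n_cols)]


def wass_to_unif(mass):
    """Distance from an attention-mass vector to the uniform distribution.

    ASSUMPTION: 0/1 transport cost, so the OT distance reduces to the
    total variation distance, 0.5 * sum(|p_i - 1/n|).
    """
    n = len(mass)
    total = sum(mass)  # renormalize defensively
    return 0.5 * sum(abs(m / total - 1.0 / n) for m in mass)


def layerwise_scores(attn_by_layer):
    """Compute one score per decoder layer from a list of attention matrices."""
    return [wass_to_unif(attention_mass(a)) for a in attn_by_layer]


# Toy inputs: one layer attends evenly, another collapses onto one source token.
spread = [[0.25, 0.25, 0.25, 0.25]] * 3
peaked = [[0.97, 0.01, 0.01, 0.01]] * 3
scores = layerwise_scores([spread, peaked])
```

In this toy setting the evenly spread layer scores near zero and the collapsed layer scores much higher, which is the kind of per-layer contrast a layer-resolved analysis would compare across all decoder layers.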
The proliferation of advanced neural machine translation and abstractive summarization models necessitates robust methods for detecting and mitigating hallucinations, which this research directly addresses.
Improving the reliability of AI-generated content, particularly in critical applications like NMT, is crucial for wider adoption and trust in autonomous AI systems.
The ability to detect hallucinations inside a model's 'black box' at layer-resolved granularity provides new tools for model interpretability and safety.
- AI model developers
- NLP researchers
- Industries relying on NMT and summarization
- AI safety researchers
- AI systems prone to generating hallucinations unchecked
More reliable neural machine translation and abstractive summarization outputs due to enhanced hallucination detection.
Improved trust and broader adoption of AI-driven content generation in sensitive domains.
Fewer hallucination events could accelerate the deployment of autonomous AI agents in high-stakes environments, potentially compressing existing workflows.
Read at arXiv cs.CL