
arXiv:2606.02737v1 Announce Type: cross
Abstract: Dense retrieval models exhibit positional bias: retrieval effectiveness degrades when relevant information appears later in a passage (Zeng et al., 2025). We ask whether this bias can be reduced at inference time, without retraining and without sacrificing overall retrieval effectiveness. To this end, we adapt inference-time attention calibration (Schuhmacher et al., 2026) to downstream retrieval and extend it with a strength coefficient lambda that interpolates between the original and fully calibrated attention distributions. Across three emb
As dense retrieval models spread through AI systems, their positional bias must be addressed to keep information access reliable.
Reducing that bias at inference time makes dense retrieval fairer and more effective, which in turn can improve accuracy and limit serious errors in applications ranging from search to autonomous agents.
Calibrating attention for position fairness without retraining lowers the cost of deploying robust retrieval systems: it reduces development overhead and can improve real-world performance.
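The strength coefficient described in the abstract can be illustrated with a minimal sketch. The abstract says only that lambda interpolates between the original and fully calibrated attention distributions; the linear (convex) blend below, the function name `interpolate_attention`, and the toy values are assumptions for illustration, not the paper's implementation. How the calibrated distribution itself is obtained is not stated in the text above, so it is taken here as a given input.

```python
from typing import List, Sequence


def interpolate_attention(
    original: Sequence[float],
    calibrated: Sequence[float],
    lam: float,
) -> List[float]:
    """Blend two attention distributions over the same token positions.

    Assumption: a linear interpolation, (1 - lam) * original + lam * calibrated.
    lam = 0 keeps the original attention; lam = 1 uses the calibrated attention.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(original) != len(calibrated):
        raise ValueError("distributions must cover the same positions")
    return [(1.0 - lam) * o + lam * c for o, c in zip(original, calibrated)]


# Toy values (hypothetical): attention mass concentrated on early positions,
# and a stand-in "calibrated" distribution that spreads it more evenly.
original = [0.6, 0.3, 0.1]
calibrated = [0.4, 0.3, 0.3]

half_strength = interpolate_attention(original, calibrated, 0.5)
```

Under this assumption, any lambda in [0, 1] yields a valid distribution, because a convex combination of two distributions that each sum to 1 also sums to 1. Intermediate values trade off bias reduction against staying close to the model's original attention.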
- AI developers
- Information retrieval systems
- Users of AI search/Q&A systems
- AI accuracy and reliability
- Systems with high positional bias
- Inefficient AI development cycles
Wider adoption of more reliable dense retrieval models across various AI applications.
Increased trust in AI systems due to fairer and more accurate information processing.
Reduced 'hallucination' rates and critical error incidence in generative AI systems relying on retrieved information.
Read at arXiv cs.CL