LLM Doesn't Know What It Doesn't Know: Detecting Epistemic Blind Spots via Cross-Model Attribution Divergence on Clinical Tabular Data

arXiv:2606.19509v1. Abstract: Large language models (LLMs) are increasingly applied to structured clinical data, yet whether they can recognize the limits of their own knowledge on such tasks remains unexplored. We study this question through the lens of cross-model attribution divergence with the goal of reducing epistemic uncertainty for structured tasks, comparing Qwen 2.5 7B and XGBoost on a prediction task via attribution divergence analysis. We report four findings. First, LLM verbalized confidence is epistemically vacuous: it outputs a near-constant (0.856-0.937) regar
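As a rough illustration of what comparing feature attributions across two models can look like, the sketch below scores how far apart two attribution vectors are. The abstract as quoted does not say which divergence measure the paper uses, so Jensen-Shannon divergence is an assumption here, and the function names (`normalize`, `js_divergence`, `attribution_divergence`) and all example scores are hypothetical.

```python
import math


def normalize(attributions):
    """Convert raw per-feature attribution scores into a distribution
    by taking absolute values and dividing by their sum."""
    magnitudes = [abs(a) for a in attributions]
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("all attribution scores are zero")
    return [m / total for m in magnitudes]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two distributions.
    0.0 means identical; 1.0 means no shared support."""
    def kl(a, b):
        # Terms with a == 0 contribute nothing to the KL sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mixture = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


def attribution_divergence(attr_a, attr_b):
    """Divergence between two models' attributions over the same features.
    ASSUMPTION: JS divergence stands in for the paper's unspecified metric."""
    if len(attr_a) != len(attr_b):
        raise ValueError("attribution vectors must cover the same features")
    return js_divergence(normalize(attr_a), normalize(attr_b))


# Made-up attribution scores over three features for one patient record.
llm_scores = [0.7, 0.2, 0.1]       # hypothetical LLM attributions
xgb_scores = [0.1, 0.2, 0.7]       # hypothetical XGBoost attributions
print(attribution_divergence(llm_scores, xgb_scores))
```

Under this sketch, a high score on a given record would mean the two models lean on different features for it, which is one way such records could be flagged for review.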
The increasing deployment of LLMs in critical domains like healthcare necessitates a deeper understanding of their reliability, especially concerning their self-awareness of limitations.
The paper points to a flaw in how LLMs are currently deployed: the confidence a model states does not track its actual accuracy, which poses risks in high-stakes applications.
In this study, the LLM's verbalized confidence stayed in a near-constant band (0.856-0.937), making it an unreliable indicator of epistemic uncertainty; trustworthiness of AI outputs has to be assessed by other means.
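One simple way to check whether a model's stated confidence carries any signal is to look at how much it varies and whether it differs between correct and incorrect predictions. The sketch below is a minimal version of that check; the function name `confidence_report` and the example values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def confidence_report(confidences, correct):
    """Summarize how informative stated confidence is.

    confidences: stated confidence per prediction, in [0, 1]
    correct:     whether each prediction was right (bool)
    Returns the spread of confidence values and the gap between mean
    confidence on correct and on incorrect predictions (None if one
    of the two groups is empty).
    """
    if len(confidences) != len(correct):
        raise ValueError("inputs must have the same length")
    if not confidences:
        raise ValueError("inputs must not be empty")
    right = [c for c, ok in zip(confidences, correct) if ok]
    wrong = [c for c, ok in zip(confidences, correct) if not ok]
    gap = mean(right) - mean(wrong) if right and wrong else None
    return {
        "spread": max(confidences) - min(confidences),
        "gap": gap,
    }


# Made-up example: confidence barely moves whether or not the answer is right.
stated = [0.90, 0.91, 0.89, 0.90]
was_correct = [True, False, True, False]
print(confidence_report(stated, was_correct))
```

A small spread together with a gap near zero would suggest the stated confidence says little about which predictions to trust.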
- AI safety researchers
- Explainable AI (XAI) developers
- Healthcare AI regulatory bodies
- LLM developers relying on verbalized confidence
- Applications deploying unvalidated LLM outputs
- Patients trusting unverified clinical AI predictions
Immediate re-evaluation of LLM confidence mechanisms and their use in decision-making.
Increased demand for robust uncertainty quantification and 'does-not-know' detection mechanisms in AI models.
Potential slowdown in the adoption of LLMs in sensitive domains until these epistemic limitations are adequately addressed.
Read at arXiv cs.AI