
arXiv:2606.29490v1. Abstract: Confidence is an estimate of the probability that a chosen answer is correct. Verbal confidence reports are widely used as uncertainty measures in large language models, but whether they are best understood as estimates of correctness is unclear. We test this with a two-stage abstention paradigm from the neuroscience of perceptual decision making: a model first answers and reports its confidence, then decides whether to commit it to a user or abstain. Across four non-reasoning models, prompt framings, and confidence formats, verbal confidence pre
As LLMs are integrated into critical applications, their black-box nature makes a deeper understanding of their internal states and reliability necessary.
This research reveals a fundamental disconnect between reported LLM confidence and actual correctness, impacting trust, safety, and the efficacy of agentic AI systems.
The research shows that current methods for assessing LLM reliability from verbal confidence reports are flawed, so robust uncertainty quantification will require new approaches.
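To make the two-stage setup concrete, here is a minimal Python sketch of how one might check whether verbal confidence tracks correctness or the later commit/abstain decision. It is not the paper's analysis: the `Trial` record, the `expected_calibration_error` and `mean_gap` helpers, the bin count, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trial:
    """One hypothetical trial of a two-stage abstention task."""
    confidence: float  # stage 1: verbal confidence, scaled to [0, 1]
    correct: bool      # whether the stage-1 answer was right
    committed: bool    # stage 2: True if committed to the user, False if abstained


def expected_calibration_error(trials: List[Trial], n_bins: int = 5) -> float:
    """Standard binned ECE: weighted mean of |mean confidence - accuracy| per bin."""
    bins: List[List[Trial]] = [[] for _ in range(n_bins)]
    for t in trials:
        # Clamp so that confidence == 1.0 falls in the last bin.
        idx = min(int(t.confidence * n_bins), n_bins - 1)
        bins[idx].append(t)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_conf = sum(t.confidence for t in b) / len(b)
        accuracy = sum(t.correct for t in b) / len(b)
        ece += (len(b) / len(trials)) * abs(mean_conf - accuracy)
    return ece


def mean_gap(trials: List[Trial], outcome: Callable[[Trial], bool]) -> float:
    """Mean confidence where `outcome` holds minus mean confidence where it does not."""
    yes = [t.confidence for t in trials if outcome(t)]
    no = [t.confidence for t in trials if not outcome(t)]
    if not yes or not no:
        return float("nan")
    return sum(yes) / len(yes) - sum(no) / len(no)


# Toy data, invented for illustration only (not results from the paper):
# confidence separates commit from abstain, but not right from wrong.
toy_trials = [
    Trial(confidence=0.9, correct=True, committed=True),
    Trial(confidence=0.9, correct=False, committed=True),
    Trial(confidence=0.3, correct=True, committed=False),
    Trial(confidence=0.3, correct=False, committed=False),
]

ece = expected_calibration_error(toy_trials)
gap_by_correctness = mean_gap(toy_trials, lambda t: t.correct)
gap_by_commitment = mean_gap(toy_trials, lambda t: t.committed)
```

On this invented data the confidence gap between correct and incorrect answers is zero while the gap between committed and abstained answers is large, which is the kind of dissociation such a paradigm is designed to expose. A real evaluation would need many more trials and a proper statistical test.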
- AI safety researchers
- Developers of new LLM calibration techniques
- Companies using LLMs in high-stakes environments
- LLMs relying solely on verbal confidence for risk assessment
- Users blindly trusting LLM confidence scores
- Early-stage AI agent deployments without robust uncertainty handling
Demand will increase for robust, explicit uncertainty quantification methods for large language models.
New architectural designs or training objectives for LLMs may emerge to better align confidence with correctness.
Regulatory frameworks for AI will likely incorporate stricter requirements for verifiable uncertainty and risk reporting from deployed models.
Read at arXiv cs.LG