
arXiv:2604.04944v2. Abstract: Multiple-choice questions (MCQs) are widely used to evaluate large language models (LLMs). However, LLMs remain vulnerable to the presence of plausible distractors. This often diverts attention toward irrelevant choices, resulting in unstable oscillation between correct and incorrect answers. In this paper, we propose Inclusion-of-Thoughts (IoT), a progressive self-filtering strategy that is designed to mitigate this cognitive load (i.e., instability of model preferences under the presence of distractors) and enable the model to focus m
Improving large language models is a central focus of AI research, and researchers are actively seeking methods that make model decisions more robust and stable.
Improving LLM stability and decision-making under uncertainty, especially with distractors, directly impacts their reliability and applicability in complex, real-world tasks where accuracy and consistency are paramount.
This research introduces a novel self-filtering strategy that may improve LLM performance on complex questions by reducing 'cognitive load,' leading to more stable and accurate responses.
- AI researchers
- LLM developers
- Industries relying on LLM decision-making
- LLMs lacking robust self-correction mechanisms
- Benchmark tests overly reliant on simple MCQs
LLMs that apply Inclusion-of-Thoughts show improved performance on multiple-choice questions despite plausible distractors.
More reliable and less 'hallucinatory' LLMs could accelerate adoption in critical applications like medical diagnostics or legal research.
More advanced self-filtering and reasoning techniques could contribute to the development of more autonomous AI agents.
Read at arXiv cs.AI