
arXiv:2606.13607v1. Abstract: When large language models (LLMs) fail to generalize or make haphazard errors in reasoning, it is often taken as evidence that LLMs are not truly reasoning, but rather performing a kind of pattern matching. The implication is that people's behavior does not exhibit the same types of failures because human reasoning uses principled and abstract world models. We evaluate human participants and 25 LLMs on their ability to engage in common-sense reasoning about a variety of everyday situations and observe similar patterns of errors in both people and
This research provides empirical evidence comparing human and LLM reasoning patterns, which bears directly on current debates over what LLMs can and cannot do.
Understanding the shared failure modes between human and LLM reasoning is crucial for designing more robust AI systems and for calibrating expectations about AI's cognitive abilities.
The findings blur the conventional distinction between human 'reasoning' and LLM 'pattern matching', suggesting that a more unified account of reasoning may be needed for both.
- AI ethicists
- Cognitive scientists
- LLM developers focusing on robust generalization
- AI developers overstating LLM reasoning superiority
- Philosophers advocating for purely abstract human reasoning
- Applications reliant on perfect, context-free LLM reasoning
This research may spark further investigation into the fundamental mechanisms of intelligence, both artificial and natural.
It could lead to new benchmarks and evaluation methods that better capture and compare reasoning across different intelligent systems.
This could ultimately inform the architecture of future autonomous AI agents, making them more resilient to common-sense failures.
Read at arXiv cs.AI