When Is Next-Token Prediction Useful? Marginalization, Ergodicity, Mixture Identifiability, Local Sufficiency, RAG, Tools, and Programming

arXiv:2605.23278v1. Abstract: Language models trained on observed sequences are often described as learning the conditional distribution of the next token given previous tokens. This description is only conditionally correct. A model trained on realized token trajectories does not observe full conditional laws; it receives sampled continuations. Moreover, real language generation is conditioned not only on previous words but also on non-textual circumstances: facts, events, intentions, goals, beliefs, social context, and task-specific constraints. This paper distinguishes thr
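The abstract's point that a model sees sampled continuations rather than full conditional laws can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the three-token vocabulary, the probabilities in `TRUE_CONDITIONAL`, and the helper names are hypothetical. The sketch fixes one prefix, draws realized next tokens from an assumed true distribution, and compares the count-based estimate with that distribution.

```python
import random
from collections import Counter

# Hypothetical "true" next-token law for one fixed prefix (assumed values).
TRUE_CONDITIONAL = {"rain": 0.6, "sun": 0.3, "snow": 0.1}


def sample_continuations(n, seed=0):
    """Draw n realized next tokens; a training corpus contains only these draws."""
    rng = random.Random(seed)
    tokens = list(TRUE_CONDITIONAL)
    weights = [TRUE_CONDITIONAL[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=n)


def empirical_conditional(samples):
    """Estimate the next-token distribution from observed counts."""
    counts = Counter(samples)
    total = len(samples)
    return {tok: counts[tok] / total for tok in TRUE_CONDITIONAL}


def total_variation(p, q):
    """Total variation distance between two distributions on the same tokens."""
    return 0.5 * sum(abs(p[t] - q[t]) for t in p)


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        est = empirical_conditional(sample_continuations(n))
        print(n, est, round(total_variation(TRUE_CONDITIONAL, est), 4))
```

In this toy setting the estimate only approximates the assumed law, and the quality of the approximation depends on how many continuations of the same prefix are observed. In real corpora most long prefixes occur once, which is one way to read the abstract's caution.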
The rapid advancement and widespread deployment of large language models necessitate a deeper theoretical understanding of their core mechanisms and limitations.
This paper challenges fundamental assumptions about how LLMs learn, informing more robust development, deployment, and risk assessment for AI systems.
The paper refines the view of next-token prediction as the sole or primary learning mechanism for LLMs, highlighting the role of embodied and contextual learning in real-world scenarios.
- AI researchers focusing on embodied AI and contextual learning
- Developers building robust, real-world AI applications
- Companies investing in multimodal and situated AI systems
- Simplified views of LLM capabilities
- Models reliant solely on next-token prediction for complex tasks
- Applications misinterpreting LLM understanding based on text generation
The paper could drive innovation in AI architectures that better integrate non-textual context and real-world interactions.
This shift could accelerate the development of more truly 'agentic' AI systems capable of understanding and navigating complex environments beyond textual data.
Improved theoretical understanding may lead to more accountable and explainable AI, as the limitations and true learning mechanisms become clearer.
Read at arXiv cs.CL