SeDT: Sentence-Transformer Decision-Transformer Conditioning for Multi-Turn Conversation Reliability

arXiv:2605.26788v1. Abstract: Large language models (LLMs) achieve impressive performance when a task is fully specified in a single turn, yet the same models lose up to 39% of that performance when the identical task is revealed incrementally across multiple turns, a phenomenon documented at scale as Lost in Conversation. Crucially, this collapse is almost entirely a reliability failure: best-case performance (aptitude) falls only 16%, while unreliability more than doubles (+112%). We argue that the root cause is structural: a flat conversation history assigns equal implicit
As LLM applications multiply and grow more complex, they need robust multi-turn conversation to reach their full potential in real-world use.
Reliable multi-turn conversation is a precondition for integrating LLMs into complex workflows and agentic systems, and it directly affects their utility and economic value.
Attention is shifting from single-turn LLM performance to multi-turn reliability, a significant bottleneck for enterprise adoption and agent development.
- LLM developers
- AI agent orchestrators
- Enterprise software reliant on conversational AI
- LLMs with poor multi-turn conditioning
- Developers neglecting conversational reliability
- AI applications requiring extensive human supervision for multi-turn interaction
Increased reliability in multi-turn LLM interactions allows for more complex and autonomous AI applications.
Improved conversational reliability accelerates the development and deployment of sophisticated AI agents across various sectors.
Enhanced agentic capabilities lead to significant automation of white-collar tasks, impacting labor markets and enterprise efficiency.
Read at arXiv cs.CL