Where to Place the Query? Unveiling and Mitigating Positional Bias in In-Context Learning for Diffusion LLMs via Decoding Dynamics

arXiv:2606.19349v1

Abstract: While In-Context Learning (ICL) is extensively studied in Autoregressive (AR) LLMs, its mechanism within Diffusion Large Language Models (dLLMs) remains largely unexplored. Unlike AR models restricted by unidirectional causal masking, dLLMs intrinsically utilize bidirectional attention, offering extensive spatial flexibility for query placement. Unfortunately, current practices conventionally inherit AR-style trailing-query templates, often overlooking the structural paradigm shift. This paper presents a comprehensive analysis unveiling that quer
This research examines where the query should be placed in in-context learning prompts for Diffusion Large Language Models (dLLMs), a question that has been overlooked because current practice inherits trailing-query templates from Autoregressive (AR) LLMs.
Understanding and mitigating positional bias in dLLMs could improve their performance, which may in turn accelerate their adoption and expand their application space.
Explicitly recognizing and analyzing positional bias in dLLMs, together with the proposed mitigation strategies, could lead to better-optimized dLLM architectures and practical deployments.
- AI researchers
- dLLM developers
- Companies deploying dLLMs
- Less optimized dLLM applications
- Developers ignoring dLLM specificities
dLLMs with improved performance and reliability become more widespread.
Increased adoption of dLLMs in applications where their unique strengths (like bidirectional attention) can be leveraged over AR models.
A subtle but significant shift in LLM architectural design principles, moving away from AR-model assumptions across the broader AI landscape.
Read at arXiv cs.CL