Persona Non Grata: LLM Persona-Driven Generations in MCQA are Unstable in Distinct Dimensions

arXiv:2607.00937v1. Abstract: Persona-driven generations (PDGs) have seen prolific use in research and industry applications, where a large language model (LLM) takes on a 'persona' while completing some task. While persona expressed through free-form text (like dialogue) has substantial work investigating stability or consistency, relatively, persona expressed in non-text-heavy outputs (like in multiple-choice question answering, or MCQA) is often overlooked. We work to address this gap, seeking to understand the instability of LLM PDGs in MCQA tasks. We develop three metric
The proliferation of persona-driven LLMs in various applications necessitates a deeper understanding of their reliability, especially as their use extends beyond pure dialogue into more structured tasks.
Understanding LLM persona instability in non-text-heavy outputs is crucial for the dependable deployment of AI agents and systems, particularly in sensitive decision-making roles.
This research provides metrics and insights into a previously overlooked aspect of LLM consistency, potentially leading to more robust and predictable AI persona development.
- AI developers focused on reliability
- Industries deploying LLMs in critical applications
- AI safety researchers
- Developers ignoring persona stability
- Applications relying on unchecked persona consistency
The research provides a framework for evaluating and improving the stability of LLM personas in specific task contexts.
Improved persona stability will enhance trust and expand the types of applications where AI agents can be reliably deployed.
This could accelerate the integration of AI agents into complex systems that require a higher degree of predictable and consistent behavior.
Read at arXiv cs.CL