Aligned but Stereotypical? How System Prompts Shape Demographic Bias in LLM-Based Text-to-Image Models

arXiv:2512.04981v2. Abstract: Text-to-image (T2I) systems increasingly rely on Large Language Model (LLM)-based text conditioning to interpret and expand user prompts. While this improves prompt understanding and text-image alignment, we find that it can also introduce implicit demographic assumptions, even when demographic attributes are unspecified. To systematically investigate this behavior across varying levels of prompt ambiguity and complexity, we construct a comprehensive benchmark covering diverse prompt settings. Evaluations on eight recent T2I models show
As LLMs become more deeply integrated into text-to-image systems, and as those systems move toward broader deployment, examining the biases they carry becomes critical.
The research finds that even recent T2I models can introduce demographic bias through implicit assumptions when a prompt leaves demographic attributes unspecified, which poses ethical and societal risks for AI development and deployment.
Because LLM-based text conditioning in T2I models can introduce demographic bias even from seemingly neutral prompts, bias detection and mitigation need to be handled more rigorously at the architectural level.
- AI ethics researchers
- Fairness-aware AI developers
- Regulatory bodies
- Uncritically deployed T2I models
- Users impacted by stereotypical outputs
- Companies relying on unmitigated T2I systems
Increased scrutiny and demand for debiasing techniques in AI models, especially those used for content generation.
Development of new benchmarking standards and auditing processes specifically for demographic bias in multimodal AI systems.
Potential for regulatory frameworks to mandate bias testing and transparency for AI systems used in public-facing applications.
Read at arXiv cs.LG