Generating the Modal Worker: A Cross-Model Audit of Race and Gender in LLM-Generated Personas Across 41 Occupations

arXiv:2510.21011v3. Abstract: As generative AI tools are increasingly used to portray people in professional roles, understanding their racial and gender representational biases is critical. We audit over 1.5 million occupational personas generated by four major large language models (GPT-4, Gemini 2.5, DeepSeek V3.1, and Mistral-medium) across 41 U.S. occupations. Comparing these personas against U.S. Bureau of Labor Statistics (BLS) data, we find that models generate demographics with less variation than real-world data, functionally compressing each occupation to
Advanced LLMs such as GPT-4 and Gemini 2.5 are spreading quickly into professional contexts, which makes auditing their biases urgent.
This research reveals how AI-generated personas compress demographic diversity, potentially embedding and amplifying societal biases across a wide range of professional representations.
Understanding these biases provides a clearer picture of risks in AI-augmented decision-making and content generation, informing the development of more equitable AI systems.
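One way to picture the "compression" described above is to compare how spread out a model's generated demographics are against a reference distribution for the same occupation. The sketch below is only an illustration under stated assumptions: it is not the paper's method, the metrics (Shannon entropy and modal share) are swapped in for demonstration, and the group labels and numbers are hypothetical placeholders rather than BLS figures or audit results.

```python
import math
from collections import Counter


def shares(labels):
    """Turn a list of demographic labels into a dict of proportions."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def entropy_bits(dist):
    """Shannon entropy in bits; lower values mean less demographic variation."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def modal_share(dist):
    """Proportion held by the single most common group."""
    return max(dist.values())


# Hypothetical reference distribution for one occupation (NOT real BLS data).
baseline = {"group_a": 0.5, "group_b": 0.25, "group_c": 0.25}

# Hypothetical labels extracted from generated personas (NOT audit results).
generated_labels = ["group_a"] * 9 + ["group_b"] * 1
generated = shares(generated_labels)

# A generated distribution is "compressed" in this sketch when its entropy is
# lower and its modal share is higher than the reference distribution's.
compressed = (
    entropy_bits(generated) < entropy_bits(baseline)
    and modal_share(generated) > modal_share(baseline)
)
print(f"baseline entropy:  {entropy_bits(baseline):.3f} bits")
print(f"generated entropy: {entropy_bits(generated):.3f} bits")
print(f"compressed toward the modal group: {compressed}")
```

In this toy case one group absorbs most of the generated personas and a third group disappears entirely, which is the kind of pattern the brief calls a "modal" stereotype.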
- AI ethicists and researchers
- Organizations developing bias mitigation techniques
- Regulators focused on AI fairness
- Companies unknowingly deploying biased AI tools
- Individuals misrepresented by AI-generated personas
- Workforces compressed into 'modal' stereotypes
Increased scrutiny and demand for transparency in LLM training data and output generation processes.
Development of new AI models specifically designed for demographic diversity and representation across professions.
Potential for legislation or industry standards requiring demographic auditing of AI systems before deployment in sensitive applications.
Read at arXiv cs.AI