Beyond the Mean: Three-Axis Fidelity for Aligning LLM-Based Survey Simulators from Small Pilot Data

arXiv:2606.28963v1 (announce type: cross)
Abstract: Large language models (LLMs) are increasingly used to simulate social survey responses, yet their outputs exhibit systematic biases: marginal distributions are skewed, response variance is poorly calibrated, and predictor-outcome relationships are attenuated. We ask a simple question: given a small pilot sample of human responses, can an LLM recover the statistical characteristics of a broader population? We decompose recovery along three axes: structural fidelity, marginal fidelity, and individual fidelity. Using a COVID-19 misinformation surv
As LLMs spread through social science research, accurate simulation depends on understanding and mitigating their systematic biases: skewed marginal distributions, poorly calibrated response variance, and attenuated predictor-outcome relationships.
This research provides a framework for aligning LLM simulators with human response patterns using a small pilot sample. That matters for applications ranging from market research to policy simulation, where biased simulations undermine the reliability of AI-generated insights.
Accurately calibrating LLM-based survey simulators from small pilot data would enable more robust and nuanced social simulations, moving beyond simple data generation toward statistical fidelity.
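To make the three axes named in the abstract more concrete, here is a minimal Python sketch of how each one could be scored when comparing simulated answers against a human pilot sample. The metric choices (total variation distance for marginal fidelity, a gap in Pearson correlation for structural fidelity, exact-match rate for individual fidelity) are illustrative assumptions, not the definitions used in the paper, and the toy data is invented for demonstration only.

```python
from collections import Counter
from math import sqrt


def marginal_gap(human, simulated, categories):
    """Total variation distance between two response distributions.

    0.0 means identical marginals, 1.0 means no overlap.
    (Assumed stand-in for "marginal fidelity".)
    """
    h, s = Counter(human), Counter(simulated)
    return 0.5 * sum(
        abs(h[c] / len(human) - s[c] / len(simulated)) for c in categories
    )


def pearson(x, y):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def structural_gap(predictor, human, simulated):
    """Simulated minus human predictor-outcome correlation.

    A negative value alongside a positive human correlation indicates
    attenuation. (Assumed stand-in for "structural fidelity".)
    """
    return pearson(predictor, simulated) - pearson(predictor, human)


def individual_agreement(human, simulated):
    """Share of respondents whose simulated answer matches their real one.

    (Assumed stand-in for "individual fidelity".)
    """
    return sum(h == s for h, s in zip(human, simulated)) / len(human)


# Hypothetical toy data: one predictor and a 1-5 Likert outcome for six people.
PREDICTOR = [1, 2, 3, 4, 5, 6]
HUMAN = [1, 2, 2, 4, 5, 5]
SIMULATED = [3, 3, 3, 3, 4, 3]  # clustered around the midpoint

if __name__ == "__main__":
    print("marginal gap:", marginal_gap(HUMAN, SIMULATED, range(1, 6)))
    print("structural gap:", structural_gap(PREDICTOR, HUMAN, SIMULATED))
    print("individual agreement:", individual_agreement(HUMAN, SIMULATED))
```

In this toy case the simulated answers cluster near the scale midpoint, so all three scores move in the direction the abstract describes: the marginals diverge, the predictor-outcome correlation shrinks, and few individual answers match.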
- Social science researchers
- AI developers
- Market research firms
- Policy makers
- Uncalibrated LLM simulation methodologies
- Organizations relying on biased LLM survey data
LLMs will be used more confidently for social surveys, leading to faster and potentially cheaper data collection methods.
Improved LLM fidelity could enable the creation of synthetic datasets that more accurately reflect human populations, aiding in privacy-preserving research.
The enhanced realism of simulated populations might lead to new ethical considerations around digital personas and their influence on public discourse.
Read at arXiv cs.LG