Synthetic Personalities: How Well Can LLMs Mimic Individual Respondents Using Socio-Economic Microdata?

arXiv:2606.04592v1 Announce Type: cross Abstract: LLM-based digital twins promise to scale and accelerate market research, but most published twins are either coarse persona bots conditioned on a few demographic questions or detailed individual-level twins built on purpose-collected surveys and interview transcripts. Neither setup speaks to the operationally most relevant case for marketing practice: building detailed individual twins from the pre-existing heterogeneous panel data that firms already accumulate through CRM systems, loyalty programs, and repeat surveys. We construct detailed ind
Advances in large language models are enabling more sophisticated mimicry of human behavior, making 'synthetic personalities' a current area of research interest. The increasing availability of granular socio-economic microdata also provides the necessary inputs.
The ability to create detailed individual digital twins from existing enterprise data has significant implications for market research, personalized marketing, and potentially for broader applications of AI agents. It represents a foundational step towards highly granular, automated interactions at scale.
Market research and customer interaction strategies could shift from statistical aggregates and broad personas to highly individualized, AI-driven simulations and engagements. This changes data utility and personalization capabilities.
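As a rough illustration of what conditioning a twin on existing records might involve, the sketch below turns one respondent's heterogeneous panel record into a persona prompt. It is not the paper's method, which the abstract excerpt above does not describe. The field names, the `build_twin_prompt` helper, and the prompt wording are all hypothetical, and the call to an actual LLM is left out.

```python
from typing import Mapping, Optional

# Hypothetical field names; real CRM, loyalty, and survey schemas will differ.
FIELD_LABELS = {
    "age": "Age",
    "region": "Region",
    "household_income_band": "Household income band",
    "loyalty_tier": "Loyalty tier",
    "last_survey_satisfaction": "Most recent satisfaction rating (1-5)",
}


def build_twin_prompt(record: Mapping[str, Optional[object]], question: str) -> str:
    """Render one respondent's available fields as a persona prompt.

    Fields that are missing or None are skipped rather than imputed,
    since panel data is rarely complete for every respondent.
    """
    lines = [
        f"- {label}: {record[key]}"
        for key, label in FIELD_LABELS.items()
        if record.get(key) is not None
    ]
    profile = "\n".join(lines) if lines else "- (no profile data available)"
    return (
        "Answer the question as the respondent described below would.\n"
        f"Respondent profile:\n{profile}\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    # Illustrative record only; the values are made up for the example.
    respondent = {"age": 42, "region": "North", "loyalty_tier": None}
    print(build_twin_prompt(respondent, "How likely are you to renew your membership?"))
```

Skipping missing fields instead of imputing them is one simple way to cope with records that differ in coverage from one respondent to the next. Whether that choice helps or hurts twin fidelity is an open question this sketch does not answer.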
- Market research firms leveraging LLMs
- Companies with rich CRM and loyalty program data
- AI platform providers
- Personalized marketing and advertising
- Traditional survey-based market research
- Generic persona-driven marketing
- Data brokers selling coarse demographic data
Companies could gain a much finer-grained ability to model and predict individual customer behavior from data they already hold.
Such individual-level understanding could enable highly targeted product development and advertising campaigns that anticipate customer behavior, further blurring the line between influence and experience.
The ethical and regulatory landscape around data privacy, deep profiling, and potential algorithmic manipulation will be significantly challenged, potentially leading to new legislation or public backlash.
Read at arXiv cs.AI