
arXiv:2605.27401v1 (announce type: cross)

Abstract: There is a growing interest in utilizing synthetic populations for a diverse range of applications. At the same time, we are witnessing a tremendous growth in artificial intelligence in all walks of life. This paper evaluates whether zero-shot large language model (LLM)-generated health survey data can serve as inputs to a conventional iterative proportional fitting (IPF) workflow for geographically explicit population synthesis. Using the 2023 Behavioral Risk Factor Surveillance System (BRFSS), we generate synthetic survey records for the U.S.
Large language models are becoming capable enough at data generation to be tested in new settings such as population synthesis.
If LLM-generated survey records prove usable, rich synthetic datasets could be built with less reliance on costly traditional surveys, with implications for policy, urban planning, and resource allocation.
Generating geographically explicit synthetic populations from LLM output could lower data collection barriers for detailed demographic and behavioral analysis.
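For readers unfamiliar with the iterative proportional fitting (IPF) step named in the abstract, the following is a minimal two-dimensional sketch of the general technique, not the paper's implementation. The function name `ipf`, the seed table, and the target margins are all hypothetical. In a workflow like the one described, the seed would typically be cross-tabulated from survey records and the targets taken from area-level control totals; those specifics are assumptions here, not details from the paper.

```python
def ipf(seed, row_targets, col_targets, max_iter=100, tol=1e-9):
    """Rescale a 2-D seed table until its row and column sums match the targets.

    Illustrative sketch only: all names and defaults are hypothetical.
    """
    if abs(sum(row_targets) - sum(col_targets)) > 1e-6:
        raise ValueError("row and column targets must have the same total")

    # Work on a float copy so the caller's seed is left untouched.
    table = [[float(cell) for cell in row] for row in seed]

    for _ in range(max_iter):
        # Row step: scale each row so it sums to its target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                factor = target / total
                table[i] = [cell * factor for cell in table[i]]

        # Column step: scale each column so it sums to its target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                factor = target / total
                for row in table:
                    row[j] *= factor

        # The column step disturbs the row sums; stop once they are back
        # within tolerance of the row targets.
        max_err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if max_err < tol:
            break

    return table


# Hypothetical inputs: a 2x2 seed (e.g. age group by health indicator)
# and made-up control totals for one geographic area.
seed = [[10.0, 20.0], [30.0, 40.0]]
row_targets = [40.0, 60.0]
col_targets = [50.0, 50.0]
fitted = ipf(seed, row_targets, col_targets)
```

The fitted table matches both sets of margins while keeping the association structure (the cross-product ratio) of the seed, which is why the quality of the seed records matters for the synthesized population.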
- AI model developers
- Urban planners
- Public health researchers
- Data scientists
- Traditional survey companies
- Agencies reliant on outdated demographic data
- Researchers without AI capabilities
More precise and cost-effective population models may become available for a range of analytical tasks.
Policy-making and resource distribution could improve as geographic and demographic insights become more detailed.
Ethical and privacy concerns around synthetic data generation and its potential misuse may escalate, possibly prompting new regulatory frameworks.
Read at arXiv cs.AI