Synthetic Homes: A Multimodal Generative AI Pipeline for Residential Building Data Generation under Data Scarcity

arXiv:2509.09794v5. Abstract: Computational models have emerged as powerful tools for multi-scale energy modeling research at the building and urban scale, supporting data-driven analysis across building and urban energy systems. However, these models require large amounts of building parameter data that is often inaccessible, expensive to collect, or subject to privacy constraints. We introduce a modular, multimodal generative Artificial Intelligence (AI) framework that integrates image, tabular, and simulation-based components and produces synthetic residential bu
The increasing availability of generative AI models coincides with persistent data scarcity issues in crucial sectors like urban planning and energy modeling.
A framework that generates synthetic building data addresses a fundamental bottleneck in data-driven simulation, the shortage of accessible building parameter data, and could enable more accurate and scalable urban and energy planning despite real-world data limitations.
The ability to generate high-fidelity synthetic building data across multiple modalities could accelerate research and application in smart cities, sustainable development, and infrastructure management.
- Urban planners
- Energy utilities
- AI model developers
- Real estate tech
- Traditional data collection services
- Manual data analysts
Researchers gain access to large, diverse datasets for energy modeling previously unavailable.
Improved predictive models lead to more efficient and sustainable building designs and urban infrastructure.
The methodology could be extended to address data scarcity in other sensitive or complex domains, spurring broader AI application.
Read at arXiv cs.LG