
arXiv:2605.28232v1. Abstract: Occupant comfort and grid-aware energy efficiency are competing objectives whose joint optimization depends critically on how reward functions are specified in deep reinforcement learning (DRL) controllers for buildings. Yet reward design remains largely ad hoc: comfort terms are either hand-tuned heuristics or simple temperature-deviation proxies without explicit grounding in thermal-comfort physics. We present PIRS (Physics-Informed Reward Shaping), which replaces these ad-hoc comfort proxies with the ISO 7730 Predicted Mean Vote (PMV) formulat
The growing integration of AI in critical infrastructure like building management highlights the need for more robust and physics-grounded control systems to optimize energy consumption and occupant comfort.
This development suggests that AI for real-world systems is maturing, moving from ad hoc solutions to physics-informed approaches that could improve reliability and efficiency in energy management.
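Reward function design for DRL in building energy management is becoming less heuristic and more systematically grounded in established thermal-comfort physics: comfort terms based on the ISO 7730 PMV model take the place of hand-tuned heuristics and temperature-deviation proxies, which should make controller behavior more predictable.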
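To make the idea of a PMV-based comfort term concrete, here is a minimal Python sketch. The `pmv_iso7730` function follows the standard ISO 7730 PMV computation. The `shaped_reward` function, its weights, and the use of `-|PMV|` as the comfort penalty are assumptions for illustration only: the abstract is truncated and does not say how PIRS combines PMV with the energy term.

```python
import math


def pmv_iso7730(tdb, tr, vel, rh, met, clo, wme=0.0):
    """Predicted Mean Vote following the ISO 7730 computation.

    tdb: air temperature (deg C); tr: mean radiant temperature (deg C);
    vel: relative air velocity (m/s); rh: relative humidity (%);
    met: metabolic rate (met); clo: clothing insulation (clo);
    wme: external work (met), usually 0.
    """
    pa = rh * 10.0 * math.exp(16.6536 - 4030.183 / (tdb + 235.0))  # vapour pressure, Pa
    icl = 0.155 * clo            # clothing insulation, m2.K/W
    m = met * 58.15              # metabolic rate, W/m2
    mw = m - wme * 58.15         # internal heat production, W/m2
    fcl = 1.0 + 1.29 * icl if icl <= 0.078 else 1.05 + 0.645 * icl
    hcf = 12.1 * math.sqrt(vel)  # forced-convection coefficient
    taa, tra = tdb + 273.0, tr + 273.0

    # Iterate for the clothing surface temperature.
    tcla = taa + (35.5 - tdb) / (3.5 * icl + 0.1)
    p1 = icl * fcl
    p2, p3, p4 = p1 * 3.96, p1 * 100.0, p1 * taa
    p5 = 308.7 - 0.028 * mw + p2 * (tra / 100.0) ** 4
    xn, xf = tcla / 100.0, tcla / 50.0
    hc = hcf
    for _ in range(150):
        if abs(xn - xf) <= 0.00015:
            break
        xf = (xf + xn) / 2.0
        hcn = 2.38 * abs(100.0 * xf - taa) ** 0.25  # natural convection
        hc = max(hcf, hcn)
        xn = (p5 + p4 * hc - p2 * xf ** 4) / (100.0 + p3 * hc)
    else:
        raise RuntimeError("PMV iteration did not converge")
    tcl = 100.0 * xn - 273.0

    # Heat-loss components, W/m2.
    hl1 = 3.05e-3 * (5733.0 - 6.99 * mw - pa)           # skin diffusion
    hl2 = 0.42 * (mw - 58.15) if mw > 58.15 else 0.0    # sweating
    hl3 = 1.7e-5 * m * (5867.0 - pa)                    # latent respiration
    hl4 = 0.0014 * m * (34.0 - tdb)                     # dry respiration
    hl5 = 3.96 * fcl * (xn ** 4 - (tra / 100.0) ** 4)   # radiation
    hl6 = fcl * hc * (tcl - tdb)                        # convection
    ts = 0.303 * math.exp(-0.036 * m) + 0.028           # sensation coefficient
    return ts * (mw - hl1 - hl2 - hl3 - hl4 - hl5 - hl6)


def shaped_reward(pmv, energy_kwh, w_comfort=1.0, w_energy=0.1):
    """Hypothetical reward (not the PIRS formula): penalise |PMV| and energy use.

    The weights are illustrative placeholders, not values from the paper.
    """
    return -w_comfort * abs(pmv) - w_energy * energy_kwh


if __name__ == "__main__":
    # Example zone state: 22 deg C, 60 % RH, light office activity, summer clothing.
    pmv = pmv_iso7730(tdb=22.0, tr=22.0, vel=0.1, rh=60.0, met=1.2, clo=0.5)
    print("PMV:", round(pmv, 2))
    print("reward:", round(shaped_reward(pmv, energy_kwh=1.5), 3))
```

Unlike a plain temperature-deviation proxy, the PMV term responds to humidity, air speed, radiant temperature, clothing, and activity level, so the same air temperature can be rewarded differently depending on those conditions.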
- Building automation companies
- Energy management software providers
- Occupants of smart buildings
- AI/ML researchers in control systems
- Providers of less efficient, heuristic-based control systems
- Energy waste
Building energy consumption is better optimized, potentially reducing operating costs and carbon footprint.
Improved thermal comfort in buildings could lead to increased productivity and occupant satisfaction, further driving adoption of advanced DRL systems.
This approach could be generalized to other complex control systems, enabling more reliable AI deployment across various industrial sectors and potentially alleviating grid strain.
Read at arXiv cs.AI