Cost-Optimal Coordination for Peak Demand Reduction in Saudi Residential Buildings Using Physics-Informed Deep Reinforcement Learning
Abstract
When multiple On/Off split air-conditioning units in Saudi residential buildings activate simultaneously, the resulting peak demand spike stresses the electrical grid and inflates monthly bills under the kingdom’s two-tier tariff (0.18 SAR/kWh for consumption ≤ 6,000 kWh; 0.30 SAR/kWh above). This paper proposes a Physics-Informed Proximal Policy Optimization (PI-PPO) framework that learns a stationary scheduling policy, applicable over an infinite time horizon without re-solving any optimization, to coordinate 18,500 BTU On/Off split units (1.8 kW input, EER 10.25) across multiple zones. Each zone is abstracted as a scheduling task with formally analyzed minimum utilization and feasibility conditions. The model incorporates inter-zone thermal coupling, enabling the scheduler to exploit thermal buffering through shared walls. PI-PPO embeds heat balance equations directly into the reinforcement learning reward, yielding a controller that maintains thermal comfort within the specified bounds at all times, a guarantee absent from standard deep reinforcement learning methods. We further show that extending the comfort range by ±1 °C (from 23–25 °C to 22–26 °C) reduces each zone’s minimum utilization by 36.9%. Simulations using EnergyPlus with Jeddah weather data across four months (January, April, July, October) show that PI-PPO reduces peak demand by 40–60% and July cost by 22.5% for a 5-zone villa, rising to 47.0% for a 20-zone compound with comfort extension. Ablation studies attribute 6.0 percentage points to physics-informed shaping, 4.5 pp to tiered-tariff awareness, 2.0 pp to inter-zone coupling, and 14.0 pp to comfort extension.
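The tariff and unit ratings quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the paper's implementation. The function names are hypothetical, and it assumes block (marginal) pricing, meaning only consumption above 6,000 kWh is billed at the higher rate. It also assumes EER is expressed in BTU/h per watt and that all units that are on draw their rated 1.8 kW at the same time.

```python
# Figures below come from the abstract; names and the block-pricing
# interpretation are assumptions made for illustration.
TIER_LIMIT_KWH = 6000.0   # monthly threshold between the two tariff tiers
LOW_RATE_SAR = 0.18       # SAR/kWh up to the threshold
HIGH_RATE_SAR = 0.30      # SAR/kWh above the threshold


def monthly_bill_sar(energy_kwh: float) -> float:
    """Monthly bill in SAR, assuming only the excess is billed at the high rate."""
    if energy_kwh < 0:
        raise ValueError("energy_kwh must be non-negative")
    low_block = min(energy_kwh, TIER_LIMIT_KWH)
    high_block = max(energy_kwh - TIER_LIMIT_KWH, 0.0)
    return low_block * LOW_RATE_SAR + high_block * HIGH_RATE_SAR


def input_power_kw(capacity_btu_per_h: float, eer: float) -> float:
    """Electrical input in kW from cooling capacity (BTU/h) and EER (BTU/h per W)."""
    if eer <= 0:
        raise ValueError("eer must be positive")
    return capacity_btu_per_h / eer / 1000.0


def coincident_peak_kw(units_on: int, unit_kw: float = 1.8) -> float:
    """Aggregate demand in kW when `units_on` On/Off units run simultaneously."""
    if units_on < 0:
        raise ValueError("units_on must be non-negative")
    return units_on * unit_kw


if __name__ == "__main__":
    print(f"Rated input: {input_power_kw(18_500, 10.25):.2f} kW")
    print(f"Bill at 7,000 kWh: {monthly_bill_sar(7000):.2f} SAR")
    print(f"5 units on: {coincident_peak_kw(5):.1f} kW")
```

Under these assumptions, 18,500 BTU/h at EER 10.25 corresponds to roughly 1.8 kW of electrical input, which matches the rating quoted in the abstract. Because the marginal price rises once the 6,000 kWh threshold is crossed, a scheduler that is aware of the tier boundary has a cost incentive beyond peak shaving alone.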