Cost-Optimal Coordination for Peak Demand Reduction in Saudi Residential Buildings Using Physics-Informed Deep Reinforcement Learning

Abstract

When multiple On/Off split air-conditioning units in Saudi residential buildings activate simultaneously, the resulting peak demand spike stresses the electrical grid and inflates monthly bills under the kingdom's two-tier tariff (0.18 SAR/kWh up to 6,000 kWh; 0.30 SAR/kWh above). This paper proposes a Physics-Informed Proximal Policy Optimization (PI-PPO) framework that learns a stationary scheduling policy, applicable over an infinite time horizon without re-solving any optimization, to coordinate 18,500 BTU On/Off split units (1.8 kW input, EER 10.25) across multiple zones. Each zone is abstracted as a scheduling task with formally analyzed minimum utilization and feasibility conditions. The model incorporates inter-zone thermal coupling, enabling the scheduler to exploit thermal buffering through shared walls. PI-PPO embeds heat balance equations directly into the reinforcement learning reward, yielding a controller that maintains thermal comfort within the specified bounds at all times, a guarantee absent from standard deep reinforcement learning methods. We further show that extending the comfort range by ±1 °C (from 23–25 °C to 22–26 °C) reduces each zone's minimum utilization by 36.9%. Simulations using EnergyPlus with Jeddah weather data across four months (January, April, July, October) show that PI-PPO reduces peak demand by 40–60% and July cost by 22.5% for a 5-zone villa, rising to 47.0% for a 20-zone compound with comfort extension. Ablation studies attribute 6.0 percentage points to physics-informed shaping, 4.5 pp to tiered-tariff awareness, 2.0 pp to inter-zone coupling, and 14.0 pp to comfort extension.
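
The tariff and unit figures quoted in the abstract can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: the function names `monthly_bill_sar` and `rated_input_kw` are hypothetical, and it assumes the two-tier tariff is applied marginally (only the energy above 6,000 kWh is billed at the higher rate) and that EER is expressed in BTU/h per watt.

```python
# Illustrative sketch; names and the marginal-billing assumption are not from the paper.

TIER_LIMIT_KWH = 6000.0   # monthly threshold between the two tiers
LOW_RATE_SAR = 0.18       # SAR/kWh at or below the threshold
HIGH_RATE_SAR = 0.30      # SAR/kWh above the threshold


def monthly_bill_sar(energy_kwh: float) -> float:
    """Monthly bill in SAR, assuming marginal (block) pricing."""
    if energy_kwh < 0:
        raise ValueError("energy_kwh must be non-negative")
    low_block = min(energy_kwh, TIER_LIMIT_KWH)
    high_block = max(energy_kwh - TIER_LIMIT_KWH, 0.0)
    return low_block * LOW_RATE_SAR + high_block * HIGH_RATE_SAR


def rated_input_kw(capacity_btu_per_h: float, eer: float) -> float:
    """Electrical input in kW, assuming EER is in BTU/h per watt."""
    return capacity_btu_per_h / eer / 1000.0


if __name__ == "__main__":
    print(f"Bill for 7,000 kWh: {monthly_bill_sar(7000.0):.2f} SAR")
    print(f"Rated input: {rated_input_kw(18500.0, 10.25):.3f} kW")
```

Under these assumptions, a 7,000 kWh month costs 1,080 SAR for the first block plus 300 SAR for the remaining 1,000 kWh, and the rated input of an 18,500 BTU unit at EER 10.25 comes to roughly 1.8 kW, which agrees with the figure stated in the abstract.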
