Deep Reinforcement Learning Approaches the MILP Optimum of a Multi-Energy Optimization in Energy Communities



Abstract

As energy systems transition toward high shares of variable renewable generation, local energy communities (ECs) are increasingly relevant for enabling demand-side flexibility and self-sufficiency. This shift is particularly evident in the residential sector, where the deployment of photovoltaic (PV) systems is rapidly growing. While mixed-integer linear programming (MILP) remains the standard for operational optimization and demand response in such systems, its computational burden limits scalability and responsiveness under real-time or uncertain conditions. Reinforcement learning (RL), by contrast, offers a model-free, adaptive alternative. However, its application to real-world energy system operation remains limited. This study explores the application of a Deep Q-Network controller to a real residential EC, a setting that has received limited attention in prior work. The system comprises three single-family homes sharing a centralized heating system with a thermal energy storage (TES), a PV installation, and a grid connection. We compare the performance of MILP and RL controllers across economic and environmental metrics. Relative to a reference scenario without TES, MILP and RL reduce energy costs by 10.06% and 8.75%, respectively, and both approaches yield lower total energy consumption and CO2-equivalent emissions. Notably, the trained RL agent achieves a near-optimal outcome while requiring only 22% of the MILP’s computation time. These results demonstrate that Double Deep Q-Learning can offer a computationally efficient and practically viable alternative to MILP for real-time control in residential energy systems.
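The abstract does not give implementation details, but as a rough illustration of the Double Deep Q-Learning update it refers to, the following is a minimal sketch assuming a standard PyTorch setup; the network architecture, state features, and variable names are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

# Illustrative Q-network: maps a state vector (e.g., TES level, PV output,
# electricity price, time of day) to Q-values over discrete control actions.
class QNetwork(nn.Module):
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def double_dqn_targets(online_net: QNetwork,
                       target_net: QNetwork,
                       rewards: torch.Tensor,
                       next_states: torch.Tensor,
                       dones: torch.Tensor,
                       gamma: float = 0.99) -> torch.Tensor:
    """Double DQN target: the online network selects the next action,
    while the target network evaluates it, reducing overestimation bias."""
    with torch.no_grad():
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        return rewards + gamma * (1.0 - dones) * next_q
```

In a typical training loop, the loss would be the squared (or Huber) error between the online network's Q-values for the actions actually taken and these targets, with the target network synchronized to the online network periodically or via soft updates.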
