XRL-LLM: Explainable Reinforcement Learning Framework for Voltage Control

Shrenik Jadhav
Birva Sevak
Van-Hai Bui


Abstract

Reinforcement learning (RL) agents are increasingly deployed for voltage control in power distribution networks. However, their opaque decision-making creates a significant trust barrier that limits adoption in safety-sensitive operational settings. This paper presents XRL-LLM, a novel framework that generates natural language explanations for RL control decisions by combining game-theoretic feature attribution (KernelSHAP) with large language model (LLM) reasoning grounded in power systems domain knowledge. We deployed a Proximal Policy Optimization (PPO) agent on an IEEE 33-bus network to coordinate capacitor banks, tap changers, and shunt regulators, reducing voltage violations by 90.5% across diverse loading conditions. To make these decisions interpretable, KernelSHAP identifies the most influential state features. These features are then passed to a domain-context-engineered LLM prompt that explicitly encodes network topology, device specifications, and ANSI C84.1 voltage limits. Evaluated via G-Eval across 30 scenarios, XRL-LLM achieves an explanation quality score of 4.13/5. This represents a 33.7% improvement over template-based generation and a 67.9% improvement over raw SHAP outputs, with statistically significant gains in accuracy, actionability, and completeness (p < 0.001, Cohen’s d values up to 4.07). Additionally, a physics-grounded counterfactual verification procedure, which perturbs the underlying power flow model, confirms a causal faithfulness of 0.81 under critical loading.
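The attribution-to-prompt step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, attribution values, action string, prompt wording, and the helper names `top_k_features` and `build_prompt` are all hypothetical, and the 0.95-1.05 p.u. band is an assumed reading of the ANSI C84.1 limits. The sketch only shows how the highest-magnitude attributions might be ranked and embedded in a domain-context prompt.

```python
from typing import Dict, List, Tuple

# Assumption: a 0.95-1.05 p.u. band, a common reading of ANSI C84.1 Range A.
V_MIN_PU, V_MAX_PU = 0.95, 1.05


def top_k_features(attributions: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Rank features by absolute attribution, largest first, and keep k of them."""
    return sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]


def build_prompt(action: str, attributions: Dict[str, float], k: int = 3) -> str:
    """Assemble a plain-text prompt with network context and the top-k features."""
    lines = [
        "You are a power systems engineer explaining a voltage control action.",
        f"Network: IEEE 33-bus feeder. Voltage limits: {V_MIN_PU:.2f}-{V_MAX_PU:.2f} p.u.",
        f"Agent action: {action}",
        "Most influential state features (attribution):",
    ]
    for name, value in top_k_features(attributions, k):
        direction = "supports" if value > 0 else "opposes"
        lines.append(f"- {name}: {value:+.3f} ({direction} the action)")
    lines.append("Explain in plain language why the agent chose this action.")
    return "\n".join(lines)


# Hypothetical attribution values, as a SHAP-style explainer might return them.
example_attributions = {
    "v_bus_18": 0.42,
    "p_load_bus_30": -0.05,
    "tap_position": 0.11,
    "v_bus_33": -0.31,
}

prompt = build_prompt("switch in capacitor bank at bus 18", example_attributions)
print(prompt)
```

Ranking by absolute value keeps features that argue against the action as well as those that support it, so the explanation can mention both.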

Version published to 10.20944/preprints202603.1131.v1
Mar 16, 2026

