Neuro-Fuzzy Enhanced Deep Reinforcement Learning for Adaptive Urban Traffic Signal Control
Abstract
Deep Reinforcement Learning (DRL) has become a leading paradigm for adaptive traffic signal control, yet baseline implementations suffer from eight structural limitations spanning action discretization, reward noise, state interpretability, phase rigidity, undirected exploration, Q-value uncertainty, experience replay weighting, and sensor robustness. This paper proposes NF_PN_D3QN, a Neuro-Fuzzy enhanced extension of the Prioritized Noisy Dueling Double Deep Q-Network (PN_D3QN), which addresses all eight gaps through a unified neuro-fuzzy framework incorporating a Fuzzy Feature Extractor, Mamdani Reward Shaper, Green Duration FIS (Fuzzy Inference System), Phase Urgency scorer, Exploration Policy, softmax-entropy Q-Confidence gate, Fuzzy Priority PER (Prioritized Experience Replay), and Fuzzy Sensor Model. A methodological contribution is the documented three-version evolution of the Q-confidence gate: absolute gap thresholding produced 97% FIS dominance and no effective learning; relative gap normalization reduced but did not eliminate FIS persistence at convergence; softmax entropy correctly identified genuine network uncertainty, allowing FIS deferral to decline naturally from 11% at episode 1 to 0% by episode 30. Experiments across five traffic scenarios in Simulation of Urban Mobility (SUMO) show that NF_PN_D3QN achieves a last-25-episode mean waiting time of 6.40 seconds, statistically equivalent to PN_D3QN's converged 6.07 seconds, confirming that all DRL methods share a common performance ceiling well below Max-Pressure's 13.0 seconds. NF_PN_D3QN's primary advantage is sample efficiency: deployable performance is reached in 15 to 20 episodes versus 80 to 90 for PN_D3QN and 150 or more for D3QN, a four to eight times improvement with direct implications for live deployment where poor early decisions affect real commuters.
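The softmax-entropy Q-confidence gate described above can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the function name, the entropy threshold value, and the deferral convention are assumptions. The gate converts the agent's Q-values into a softmax distribution, measures its normalized Shannon entropy, and defers to the FIS only when entropy signals genuine network uncertainty.

```python
import numpy as np

def q_confidence_gate(q_values, entropy_threshold=0.8):
    """Softmax-entropy Q-confidence gate (illustrative sketch).

    Returns True when the Q-value distribution is too uncertain,
    signalling that control should defer to the FIS. The threshold
    value is a placeholder, not taken from the paper.
    """
    q = np.asarray(q_values, dtype=float)
    # Numerically stable softmax over the action-value vector.
    z = q - q.max()
    p = np.exp(z) / np.exp(z).sum()
    # Shannon entropy, normalized to [0, 1] by its maximum log(n).
    entropy = -(p * np.log(p + 1e-12)).sum()
    normalized = entropy / np.log(len(p))
    return normalized > entropy_threshold

# Near-uniform Q-values -> high entropy -> defer to the fuzzy system.
print(q_confidence_gate([1.01, 1.00, 0.99, 1.00]))  # True
# One clearly dominant action -> low entropy -> trust the DRL policy.
print(q_confidence_gate([5.0, 0.1, 0.2, 0.1]))      # False
```

Unlike absolute or relative gap thresholding, this normalized-entropy form is scale-invariant in the Q-values, which is consistent with the reported behavior of FIS deferral declining naturally as the network's estimates sharpen during training.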