Multi-Agent Reinforcement Learning with Two Layer Control Plane for Traffic Engineering

Evgeniy Stepanov
Ruslan Smeliansky
Ivan Garkavy


Abstract

The article presents a new method for multi-agent traffic flow balancing, based on the MAROH multi-agent optimization method. Unlike MAROH, the agent's control plane is modeled on principles of human decision-making and consists of two layers. The first layer lets the agent decide autonomously from accumulated experience: a memory of representative states the agent has already encountered, together with the actions to take in them. The second layer lets the agent make decisions in unfamiliar states. A state is considered familiar if it is close, under a chosen metric, to a state the agent has already encountered. The article explores several state proximity metrics and several ways to organize the agent's memory. Experiments show that the proposed two-layer method retains the efficiency of the single-layer method, accelerates the agent's decision-making, and reduces inter-agent communication by 84% compared to the single-layer method.
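The two-layer decision rule described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the authors' implementation: the class name `TwoLayerAgent`, the Euclidean metric, the fixed `threshold`, the flat list used as memory, and the `fallback` callable that stands in for whatever slower procedure the second layer actually uses.

```python
import math
from typing import Callable, Hashable, List, Sequence, Tuple

State = Sequence[float]
Action = Hashable


def euclidean(a: State, b: State) -> float:
    """One possible proximity metric (an assumption; the article compares several)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class TwoLayerAgent:
    """Hypothetical agent: layer 1 answers from memory, layer 2 handles unfamiliar states."""

    def __init__(
        self,
        threshold: float,
        fallback: Callable[[State], Action],
        metric: Callable[[State, State], float] = euclidean,
    ) -> None:
        self.threshold = threshold      # how close a state must be to count as familiar
        self.fallback = fallback        # stand-in for the second layer
        self.metric = metric
        self.memory: List[Tuple[Tuple[float, ...], Action]] = []
        self.fallback_calls = 0         # counts how often layer 2 was needed

    def act(self, state: State) -> Action:
        # Layer 1: look for the nearest remembered state.
        nearest = min(
            self.memory,
            key=lambda entry: self.metric(entry[0], state),
            default=None,
        )
        if nearest is not None and self.metric(nearest[0], state) <= self.threshold:
            return nearest[1]

        # Layer 2: unfamiliar state, so ask the fallback and remember the answer.
        self.fallback_calls += 1
        action = self.fallback(state)
        self.memory.append((tuple(state), action))
        return action


agent = TwoLayerAgent(threshold=0.5, fallback=lambda s: "rebalance")
agent.act((0.0, 0.0))   # unfamiliar: handled by layer 2 and stored
agent.act((0.1, 0.1))   # familiar: answered from memory by layer 1
agent.act((5.0, 5.0))   # unfamiliar again: layer 2
```

In this sketch, every state answered from memory avoids one call to the second layer, which is the kind of saving the abstract reports; the linear scan over memory is the simplest possible organization and would likely be replaced by an indexed structure in practice.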

Version published to 10.20944/preprints202508.0818.v1
Aug 12, 2025

