MARL-CC: A Mathematical Framework for Multi-Agent Reinforcement Learning in Connected Autonomous Vehicles: Addressing Nonlinearity, Partial Observability, and Credit Assignment for Optimal Control

Mazyar Taghavi
Javad Vahidi

Abstract

Multi-Agent Reinforcement Learning (MARL) has emerged as a powerful paradigm for cooperative decision-making in connected autonomous vehicles (CAVs); however, existing approaches often fail to guarantee stability, optimality, and interpretability in systems characterized by nonlinear dynamics, partial observability, and complex inter-agent coupling. This study addresses these foundational challenges by introducing MARL-CC, a unified Mathematical Framework for Multi-Agent Reinforcement Learning with Control Coordination. The proposed framework integrates differential geometric control, Bayesian inference, and Shapley-value-based credit assignment within a coherent optimization architecture, ensuring bounded policy updates, decentralized belief estimation, and equitable reward distribution. Theoretical analyses establish convergence and stability guarantees under stochastic disturbances and communication delays. Empirical evaluations across simulation and real-world testbeds demonstrate up to a 40% improvement in convergence rate and enhanced cooperative efficiency over leading baselines, including PPO, DDPG, and QMIX. These results signify a decisive advance in control-oriented reinforcement learning, bridging the gap between mathematical rigor and practical autonomy. The MARL-CC framework provides a scalable foundation for intelligent transportation, UAV coordination, and distributed robotics, paving the way toward interpretable, safe, and adaptive multi-agent systems. All code and experimental configurations are publicly available on GitHub to support reproducibility and future research.
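
The abstract names Shapley-value-based credit assignment without describing it. As a rough illustration of the general idea only, the sketch below computes exact Shapley values for a tiny cooperative game by averaging each agent's marginal contribution over all join orders. It is not the authors' implementation: the function names, the agent labels (`car_a`, `car_b`, `car_c`), and the toy reward function are all hypothetical, and exact enumeration is only practical for a handful of agents.

```python
from itertools import permutations
from math import factorial


def shapley_values(agents, coalition_value):
    """Exact Shapley values: average marginal contribution over all join orders.

    `coalition_value` maps a frozenset of agents to a team reward.
    Cost grows factorially with the number of agents, so this is a toy method.
    """
    totals = {agent: 0.0 for agent in agents}
    for order in permutations(agents):
        coalition = frozenset()
        for agent in order:
            with_agent = coalition | {agent}
            # Marginal contribution of `agent` when joining this coalition.
            totals[agent] += coalition_value(with_agent) - coalition_value(coalition)
            coalition = with_agent
    n_orders = factorial(len(agents))
    return {agent: total / n_orders for agent, total in totals.items()}


def toy_team_reward(coalition):
    """Hypothetical team reward: individual terms plus a cooperation bonus."""
    base = {"car_a": 1.0, "car_b": 2.0, "car_c": 3.0}
    reward = sum(base[agent] for agent in coalition)
    if {"car_a", "car_b"} <= coalition:
        reward += 2.0  # assumed synergy when car_a and car_b cooperate
    return reward


agents = ["car_a", "car_b", "car_c"]
credits = shapley_values(agents, toy_team_reward)
print(credits)
```

In this toy game the cooperation bonus is split evenly between `car_a` and `car_b`, and the credits sum to the full team reward, which is the efficiency property that makes Shapley values attractive for dividing a shared reward.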

Version published to 10.21203/rs.3.rs-7996305/v1 on Research Square
Nov 19, 2025

