A Comprehensive Survey of Multi-Agent Reinforcement Learning for Autonomous Systems: Algorithms, Applications, and Open Challenges


Abstract

Autonomous systems increasingly perform tasks in complex, multi-agent environments in which coordination, scalability, safety, and reliability are key requirements. In these environments, non-stationarity of the environment, decentralized information, and tightly coupled agent dynamics pose fundamental limitations for traditional single-agent reinforcement learning. Multi-agent reinforcement learning (MARL) has emerged as an effective framework for addressing these issues, enabling agents to learn cooperative, competitive, or mixed strategies through interaction with one another. Nevertheless, despite algorithmic breakthroughs, the application of MARL to real-world autonomous systems remains limited by scalability constraints, communication assumptions, safety considerations, and weak theoretical guarantees. This paper presents a critical and detailed survey of MARL for autonomous systems, focusing on both algorithmic and application-driven perspectives. It follows a systematic literature review methodology to gather, filter, and group recent peer-reviewed works by MARL paradigm and autonomous system domain. MARL algorithms are categorized into value-based, policy-based, and hybrid approaches, and their advantages, disadvantages, and implementation implications are comparatively discussed. Applications in autonomous vehicles, UAV swarms, multi-robot systems, and industrial autonomous environments are examined to highlight domain-specific constraints related to coordination, communication, and safety. Through cross-domain synthesis, the survey identifies persistent open issues, including scaling to large and heterogeneous sets of agents, reliance on idealized communication models, weak sim-to-real transfer, limited interpretability, and the absence of general convergence and safety guarantees.
Finally, the most important research gaps are presented, and future directions are proposed to make MARL safer, more interpretable, and more readily deployable in autonomous systems. This survey offers structured, critical guidance to researchers and practitioners who aim to build robust MARL-enabled autonomy that extends beyond simulation-only evaluation.
