A Comprehensive Survey of Multi-Agent Reinforcement Learning for Autonomous Systems: Algorithms, Applications, and Open Challenges


Abstract

Autonomous systems increasingly perform tasks in complex, multi-agent environments in which coordination, scalability, safety, and reliability are key requirements. In these environments, non-stationarity of the environment, decentralized information, and tightly coupled agent dynamics pose fundamental limitations for traditional single-agent reinforcement learning. Multi-agent reinforcement learning (MARL) has emerged as an effective framework for addressing these issues, enabling agents to learn cooperative, competitive, or mixed strategies through interaction with one another. Nevertheless, despite algorithmic breakthroughs, the application of MARL to real-world autonomous systems remains limited by scalability constraints, communication assumptions, safety considerations, and weak theoretical guarantees. This paper presents a critical and detailed survey of MARL for autonomous systems, focusing on both algorithmic and application-driven perspectives. It follows a systematic literature review methodology to gather, filter, and group recent peer-reviewed works by MARL paradigm and autonomous system domain. MARL algorithms are categorized into value-based, policy-based, and hybrid approaches, and their advantages, disadvantages, and implementation implications are comparatively discussed. Applications in autonomous vehicles, UAV swarms, multi-robot systems, and industrial autonomous environments are examined to highlight domain-specific constraints related to coordination, communication, and safety. Through cross-domain synthesis, the survey identifies persistent open issues, including scaling to large and heterogeneous sets of agents, reliance on idealized communication models, weak sim-to-real transfer, limited interpretability, and the absence of general convergence and safety guarantees.
Finally, the most important research gaps are presented, and future directions are proposed to make MARL safer, more interpretable, and more readily deployable in autonomous systems. This survey offers structured, critical guidance to researchers and practitioners who aim to build robust MARL-enabled autonomy that extends beyond simulation-only evaluation.
