Enhancing Learning of Collective Transport with Global State Prediction under Local, Bandwidth-Limited Communication Constraints

Joshua Bloom
Julian Poindexter
Carlo Pinciroli

Abstract

We study how multi-agent deep reinforcement learning can address complex collective transport problems. Most approaches to collective transport assume that the payload has a known shape and mass distribution. Without these assumptions, finding closed-form analytic solutions for distributed robot control becomes challenging. Deep reinforcement learning (DRL) offers a potential solution. However, applying DRL to multi-robot systems presents challenges, particularly due to non-stationarity. Traditional methods for addressing non-stationarity involve centralized policies or message passing, which limit scalability. We introduce Global State Prediction (GSP), a network that predicts the global state of the swarm using limited information. Each agent uses information communicated by its neighbors to build a local prediction of the joint intentions of the swarm. We provide a theoretical analysis establishing GSP's foundations within the frameworks of partially observable stochastic games, mean-field theory, and graphical games, showing how our approach can mitigate non-stationarity while enhancing coordination among agents. Through extensive experiments in both simulation and real-world environments, we show that GSP outperforms baselines across diverse scenarios. Notably, policies trained with one-hop communication (GSP-N) scale better than those trained with global communication, even with unknown payload characteristics, and reduce bandwidth from O(n^2) to O(1), offering a scalable path toward practical swarm deployment in unstructured environments.

Version published to 10.21203/rs.3.rs-6597379/v1 on Research Square
Jun 5, 2025

Related articles

TSPPO: Transformer-Based Sequential Proximal Policy Optimization for Multi-Agent Systems

This article has 6 authors:
1. Tao YANG
2. Xinhao SHI
3. Cheng XU
4. Yulin YANG
5. Qinghan ZENG
6. Hongzhe LIU
This article has no evaluations
Latest version Jul 10, 2025

Fast k-connectivity Restoration in Multi-Robot Systems for Robust Communication Maintenance: Algorithmic and Learning-based Solutions

This article has 4 authors:
1. Guangyao Shi
2. Md Ishat-E-Rabban
3. Griffin Bonner
4. Pratap Tokekar
This article has no evaluations
Latest version Jun 13, 2025

Probabilistic Multi-Robot Planning with Temporal Tasks and Communication Constraints

This article has 3 authors:
1. Thales Costa Silva
2. Xi Yu
3. M. Ani Hsieh
This article has no evaluations
Latest version Jun 9, 2025
