Enhancing Learning of Collective Transport with Global State Prediction under Local, Bandwidth-Limited Communication Constraints

Abstract

We study how multi-agent deep reinforcement learning (DRL) can address complex collective transport problems. Most approaches to collective transport assume that the payload has a known shape and mass distribution. Without these assumptions, finding closed-form analytic solutions for distributed robot control becomes challenging, and DRL offers a potential alternative. However, applying DRL to multi-robot systems presents challenges, particularly due to non-stationarity. Traditional methods for addressing non-stationarity rely on centralized policies or message passing, which limit scalability. We introduce Global State Prediction (GSP), a network that predicts the global state of the swarm from limited information: each agent uses information communicated by its neighbors to build a local prediction of the swarm's joint intentions. We provide a theoretical analysis grounding GSP in the frameworks of partially observable stochastic games, mean-field theory, and graphical games, showing how our approach mitigates non-stationarity while enhancing coordination among agents. Through extensive experiments in both simulation and real-world environments, we show that GSP outperforms baselines across diverse scenarios. Notably, policies trained with one-hop communication (GSP-N) scale better than those trained with global communication, even with unknown payload characteristics, and reduce bandwidth from O(n^2) to O(1), offering a scalable path toward practical swarm deployment in unstructured environments.
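
To make the one-hop idea concrete, here is a minimal sketch of what a per-agent global state predictor consuming fixed-size neighbor messages could look like. The module name, layer sizes, and the mean-pooling aggregator are illustrative assumptions, not the architecture from the article:

```python
# Hypothetical sketch of per-agent Global State Prediction (GSP) with
# one-hop communication; all names, dimensions, and architecture choices
# are assumptions for illustration, not the authors' implementation.
import torch
import torch.nn as nn


class GSPredictor(nn.Module):
    """Predicts the swarm's global state from an agent's own observation
    plus fixed-size messages received from its one-hop neighbors."""

    def __init__(self, obs_dim: int, msg_dim: int, global_dim: int, hidden: int = 64):
        super().__init__()
        self.encode_msg = nn.Sequential(nn.Linear(msg_dim, hidden), nn.ReLU())
        self.predict = nn.Sequential(
            nn.Linear(obs_dim + hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, global_dim),
        )

    def forward(self, obs: torch.Tensor, neighbor_msgs: torch.Tensor) -> torch.Tensor:
        # obs:           (batch, obs_dim)     local observation
        # neighbor_msgs: (batch, k, msg_dim)  messages from k one-hop neighbors
        # Mean-pooling the encoded messages keeps each agent's input size,
        # and hence its per-agent bandwidth, constant regardless of swarm
        # size -- the O(1) property claimed in the abstract.
        pooled = self.encode_msg(neighbor_msgs).mean(dim=1)
        return self.predict(torch.cat([obs, pooled], dim=-1))


# The predicted global state would then augment each agent's policy input,
# e.g. action = policy(torch.cat([obs, gsp(obs, msgs)], dim=-1)).
```

Under this reading, each agent runs the predictor locally, so no centralized controller or all-to-all message exchange is required at deployment time.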
