Enhancing Learning of Collective Transport with Global State Prediction under Local, Bandwidth-Limited Communication Constraints
Abstract
We study how multi-agent deep reinforcement learning can address complex collective transport problems. Most approaches to collective transport assume that the payload has a known shape and mass distribution. Without these assumptions, finding closed-form analytic solutions for distributed robot control becomes challenging. Deep reinforcement learning (DRL) offers a potential solution. However, applying DRL to multi-robot systems presents challenges, particularly due to non-stationarity. Traditional methods to address non-stationarity involve centralized policies or message passing, both of which limit scalability. We introduce Global State Prediction (GSP), a network that predicts the global state of the swarm using limited information. Each agent uses communicated information from neighbors to build a local prediction of the joint intentions of the swarm. We provide a theoretical analysis establishing GSP's foundations within the frameworks of partially observable stochastic games, mean-field theory, and graphical games, showing how our approach can mitigate non-stationarity while enhancing coordination among agents. Through extensive experiments in both simulation and real-world environments, we show that GSP outperforms baselines across diverse scenarios. Notably, policies trained with one-hop communication (GSP-N) scale better than those trained with global communication, even with unknown payload characteristics, and reduce bandwidth requirements from O(n^2) to O(1), offering a scalable path toward practical swarm deployment in unstructured environments.
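As a rough illustration of the bandwidth claim, the sketch below counts messages per control step under two communication schemes. It is not taken from the paper: the function names are hypothetical, the all-to-all scheme is assumed to be the O(n^2) baseline, and the one-hop scheme is assumed to use a fixed number of neighbors per agent (two, as in a ring), which is one reading of the O(1) figure as a per-agent cost.

```python
def global_messages(n: int) -> int:
    """Total messages per step if every agent sends its state to every other agent.

    Assumed all-to-all baseline: n * (n - 1) messages, which grows as O(n^2).
    """
    return n * (n - 1)


def one_hop_messages_per_agent(neighbors: int = 2) -> int:
    """Messages each agent sends per step with a fixed one-hop neighborhood.

    Assumption: each agent talks to a constant number of neighbors (two here),
    so the per-agent cost does not depend on swarm size, i.e. O(1).
    """
    return neighbors


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(
            f"n={n}: all-to-all total={global_messages(n)}, "
            f"one-hop per agent={one_hop_messages_per_agent()}"
        )
```

Under these assumptions the all-to-all total grows quadratically with swarm size, while the per-agent one-hop cost stays constant.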