Investigating Training Efficiency of Direct Scaling in Multi-Agent Reinforcement Learning

Abstract

As multi-agent systems become increasingly central to domains such as robotics, autonomous coordination, and distributed control, training strategies that reduce cost while maintaining effectiveness are essential. This paper explores whether training a smaller team of agents and then scaling up offers a more efficient path to high-performing policies in multi-agent reinforcement learning (MARL). Building on prior work, particularly Smit et al. (2023), we analyze whether pretraining smaller agent groups can improve training efficiency without sacrificing final performance. We introduce an agent-steps metric, which provides a standardized measure of total training effort across different agent counts. Experiments in the Waterworld, Multiwalker, and Level-based Foraging environments reveal that the effectiveness of this approach appears to be inversely related to the diversity required among agents in the final team. When tasks allow agents to adopt similar roles, pretraining on smaller groups accelerates learning; however, in environments where agents must specialize into distinct roles, the benefits of early training are diminished. These findings inform future work in curriculum learning and scalable heterogeneous-agent reinforcement learning (HARL).
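Because the agent-steps metric underpins the efficiency comparison, the sketch below illustrates one plausible accounting of it, assuming agent-steps are counted as environment steps multiplied by the number of agents acting per step. The function names and the example budgets are illustrative assumptions, not figures reported in the paper.

```python
# Minimal sketch of an assumed "agent-steps" accounting: one environment step
# with N concurrently acting agents contributes N agent-steps, so training
# runs with different team sizes can be compared on total training effort.
# All names and numbers here are illustrative, not taken from the paper.

def agent_steps(env_steps: int, num_agents: int) -> int:
    """Per-agent experience collected over a training phase."""
    return env_steps * num_agents


def total_budget(phases: list[tuple[int, int]]) -> int:
    """Sum agent-steps over a curriculum of (env_steps, num_agents) phases."""
    return sum(agent_steps(steps, n) for steps, n in phases)


if __name__ == "__main__":
    # Direct scaling: train the full 8-agent team from scratch.
    direct = total_budget([(1_000_000, 8)])

    # Scale-up curriculum: pretrain a 2-agent team, then finetune 8 agents.
    curriculum = total_budget([(600_000, 2), (500_000, 8)])

    print(f"direct scaling:      {direct:,} agent-steps")      # 8,000,000
    print(f"pretrain + scale-up: {curriculum:,} agent-steps")  # 5,200,000
```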
