Adaptive Confidence-Weighted Policy Aggregation: A Novel Method for Federated Reinforcement Learning
Abstract
This paper proposes a novel Federated Reinforcement Learning (FRL) approach called Adaptive Confidence-Weighted Policy Aggregation (ACWPA). ACWPA is designed for multi-agent tasks with incomplete information and heterogeneous knowledge, combining the strengths of multiple agents while compensating for their individual weaknesses. The method dynamically weights each agent's contribution to the global policy according to the agent's performance and the relevance of its expertise, even when state-action reward information is only partially available. Evaluated on a multi-agent path-planning task, ACWPA demonstrates improved convergence and generalization compared to standard FRL methods such as FedAvg and FedProx. Results show that ACWPA increases navigation efficiency by 20% and reduces collision rates by 35% across diverse environments, highlighting its capacity to strengthen collaborative learning in multi-agent systems with heterogeneous knowledge. Furthermore, applying ACWPA to large language models (LLMs) yielded a 15% improvement, indicating that the method may be applicable in other areas of artificial intelligence.
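To make the aggregation idea concrete, the sketch below shows one way confidence-weighted policy aggregation could be implemented. It is a minimal illustration, not the paper's exact procedure: the softmax-over-performance weighting, the temperature hyperparameter, and the flattened parameter-vector representation are all assumptions introduced here for clarity.

```python
# Minimal sketch of confidence-weighted policy aggregation.
# Assumptions (not from the paper): weights come from a softmax over
# per-agent performance scores, and each policy is a flat parameter vector.
import numpy as np

def aggregate_policies(agent_params, performance_scores, temperature=1.0):
    """Combine per-agent policy parameters into a global policy.

    agent_params: list of 1-D np.ndarray, one flattened parameter vector per agent.
    performance_scores: list of floats, e.g. recent episodic returns per agent.
    temperature: softness of the confidence weighting (assumed hyperparameter).
    """
    scores = np.asarray(performance_scores, dtype=float)
    # Softmax over performance scores yields normalized confidence weights.
    weights = np.exp(scores / temperature)
    weights /= weights.sum()
    # Global policy parameters are the confidence-weighted average.
    stacked = np.stack(agent_params)   # shape: (n_agents, n_params)
    return weights @ stacked           # shape: (n_params,)

# Example: three agents with heterogeneous performance.
params = [np.random.randn(8) for _ in range(3)]
returns = [1.2, 0.4, 2.0]
global_params = aggregate_policies(params, returns)
```

In this sketch, better-performing agents receive larger weights, so the global policy is pulled toward the more reliable local policies, which is the intuition behind confidence weighting as opposed to the uniform or sample-count-based averaging used by FedAvg.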