Deep reinforcement learning based coverage path planning in unknown environments
Abstract
The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm offers a robust solution for the coverage path planning problem, where a robot must cover a designated area with minimal redundancy and maximum coverage. Traditional path planning methods often lack the adaptability required for dynamic and unstructured environments. In contrast, TD3 uses twin Q-networks to reduce overestimation bias, delayed policy updates for increased training stability, and target policy smoothing to keep the robot's path transitions smooth. These features allow the robot to learn an optimal path strategy in real time, effectively balancing exploration and exploitation. This paper explores the application of TD3 to coverage path planning, demonstrating that it enables a robot to adaptively and efficiently navigate complex coverage tasks, with significant advantages over conventional methods in coverage rate, total path length, and adaptability.
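The two critic-side mechanisms named above (twin Q-networks with a clipped double-Q target, and target policy smoothing) can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation; the function names, noise parameters, and action bounds are assumptions chosen to match the standard TD3 formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def smoothed_target_action(policy_action, noise_std=0.2, noise_clip=0.5,
                           action_low=-1.0, action_high=1.0):
    """Target policy smoothing: perturb the target policy's action with
    clipped Gaussian noise before evaluating the target Q-values, so the
    value estimate is smoothed over nearby actions.
    (noise_std, noise_clip, and the action bounds are assumed defaults.)"""
    noise = np.clip(rng.normal(0.0, noise_std, size=np.shape(policy_action)),
                    -noise_clip, noise_clip)
    return np.clip(policy_action + noise, action_low, action_high)

def td3_critic_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: taking the minimum of the two target
    critics' estimates reduces the overestimation bias that a single
    Q-network would accumulate."""
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)
```

The third mechanism, delayed policy updates, simply means the actor (and the target networks) are updated once every few critic updates, e.g. every second gradient step, which stabilizes learning by letting the value estimates settle before the policy chases them.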