Visual Target-Driven Robot Crowd Navigation with Limited FOV Using Self-Attention Enhanced Deep Reinforcement Learning
Abstract
Navigating crowded environments poses significant challenges for mobile robots, particularly because traditional Simultaneous Localization and Mapping (SLAM)-based methods often struggle in dynamic and unpredictable settings. This paper proposes a visual target-driven navigation method using self-attention enhanced deep reinforcement learning (DRL) to overcome these limitations. The navigation policy is built on the Twin-Delayed Deep Deterministic Policy Gradient (TD3) algorithm, enabling efficient obstacle avoidance and target pursuit. We use a single RGB-D camera with a limited field of view (FOV) for both target detection and sensing of the surroundings, with environmental features extracted from the depth data by a convolutional neural network (CNN). A self-attention network (SAN) compensates for the limited FOV, improving the robot’s ability to search for the target when it is temporarily lost. Experimental results show that our method achieves a higher success rate and shorter average target-reaching time in dynamic environments, while offering hardware simplicity, cost-effectiveness, and ease of deployment in real-world applications.
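For readers unfamiliar with TD3, the following is a minimal sketch of how the standard algorithm forms its critic target, using target policy smoothing and the clipped double-Q estimate. It is not taken from the paper: the function name `td3_target`, the callables `target_actor`, `target_q1`, and `target_q2`, the scalar action, and all hyperparameter defaults are assumptions made for illustration, and the delayed actor update is omitted.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=-1.0, act_high=1.0, rng=random):
    """Return the TD3 critic target for a single transition.

    target_actor, target_q1 and target_q2 are hypothetical callables standing
    in for the target networks; the defaults are illustrative, not the paper's.
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise, act_low, act_high)
    # Clipped double-Q: use the smaller of the two target critic estimates.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    # Bootstrapped target; terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - float(done)) * q_min


# Toy stand-ins for the target networks (assumptions, scalar state and action).
actor = lambda s: 0.5 * s
q1 = lambda s, a: s + a
q2 = lambda s, a: s + a + 1.0

# noise_std=0.0 makes the example deterministic.
y = td3_target(1.0, False, 0.4, actor, q1, q2, noise_std=0.0)
y_terminal = td3_target(1.0, True, 0.4, actor, q1, q2, noise_std=0.0)
```

In a navigation setting the action would more plausibly be a vector (for example linear and angular velocity), in which case the noise and clipping apply per component; the scalar form is kept here only for brevity.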