RoboNavGuard: Lightweight Deformable Obstacle Segmentation and 3D Visual Grounding for Indoor Robot Navigation
Abstract
Navigating obstacles and understanding complex environments remain critical challenges in the deployment of automated guided vehicles (AGVs), a key area of mobile robotics. Existing AGVs often fail to recognize specific objects, such as plastic bags and feces, which can cause roller entanglement or contamination. Furthermore, discrepancies between vision and language during navigation can lead to positioning errors. In this paper, we propose a deep learning-based solution, called "RoboNavGuard", specifically designed to address these issues and improve the practical deployment of AGVs. RoboNavGuard builds on the PyraBiNet++ architecture for precise obstacle segmentation and comprises three key innovations: (a) a novel architecture that fuses local and global features for obstacle segmentation, (b) a practical and comprehensive dataset that establishes a new benchmark for 3D visual grounding, and (c) a method that tightly integrates textual commands into AGVs' visual perception, thereby enabling more natural and efficient human-robot interaction. We also introduce a new dataset, called "FragCloud3DRef++", for training our "Re_3DVG-Small" model, a lightweight 3D visual grounding model designed to enhance fragmented point cloud understanding and improve language-vision navigation. Our system is being deployed in live AGV operations to validate its real-world applicability. The source code and dataset are available at https://github.com/zehantan6970/RoboNavGuard.
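The abstract does not specify how PyraBiNet++ combines the two feature streams, so the sketch below is only one common way to fuse local (convolutional) and global (self-attention) features for segmentation; it is not the authors' implementation, and all class and variable names here are hypothetical. It assumes PyTorch and fuses a depthwise-convolution branch with a multi-head self-attention branch via concatenation and a 1x1 projection.

```python
# Illustrative sketch only: one plausible local-global fusion block, not the
# actual PyraBiNet++ module described in the paper. Names are hypothetical.
import torch
import torch.nn as nn


class LocalGlobalFusionBlock(nn.Module):
    """Fuses a convolutional (local) branch with a self-attention (global) branch."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Local branch: depthwise 3x3 conv captures fine-grained texture and edges.
        self.local_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, groups=channels),
            nn.BatchNorm2d(channels),
            nn.GELU(),
        )
        # Global branch: multi-head self-attention over flattened spatial tokens.
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Fusion: concatenate both branches, then project back with a 1x1 conv.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local_branch(x)
        # Flatten (H, W) into a token sequence for attention, then restore shape.
        tokens = self.norm(x.flatten(2).transpose(1, 2))   # (B, H*W, C)
        global_feat, _ = self.attn(tokens, tokens, tokens)
        global_feat = global_feat.transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([local, global_feat], dim=1))


if __name__ == "__main__":
    block = LocalGlobalFusionBlock(channels=64)
    feat = torch.randn(1, 64, 32, 32)      # a dummy backbone feature map
    print(block(feat).shape)               # torch.Size([1, 64, 32, 32])
```

The design choice shown here, keeping a cheap depthwise convolution for local detail while restricting attention to a single fused stage, is a typical way lightweight segmentation models balance accuracy against the compute budget of an embedded AGV platform; the paper's actual trade-offs may differ.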