Deep learning-based visual SLAM method for indoor dynamic scenes
Abstract
In indoor dynamic scenes, traditional visual SLAM algorithms often suffer significant degradation in localization accuracy due to interference from moving objects. This study proposes a deep learning-based visual SLAM method built on the ORB_SLAM3 framework that combines object detection with a dual dynamic-feature removal mechanism based on epipolar geometry constraints to improve localization accuracy. First, the original convolutional layers in the YOLOv8m network are replaced with GhostConv modules, and the CBAM attention mechanism is added to the backbone network for dynamic object detection. Then, the extracted point and line features are combined to preliminarily remove dynamic features, and residual dynamic features are further filtered out using epipolar geometry constraints. Finally, the P3P algorithm performs pose estimation on the remaining static features. In validation on the TUM_RGB_D dataset, the improved algorithm achieves an average localization accuracy improvement of 53.10% over ORB_SLAM3, with a maximum improvement of 93.56%. It also demonstrates good accuracy and robustness in physical experiments, offering new research insights for SLAM systems in dynamic scenes.
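The epipolar-constraint filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a fundamental matrix `F` between two frames is already available (e.g., estimated from matches with RANSAC), and classifies a match as dynamic when the matched point in the second image lies too far from the epipolar line induced by its counterpart in the first image. The function names and the pixel threshold are hypothetical.

```python
import numpy as np

def epipolar_residuals(F, pts1, pts2):
    """Point-to-epipolar-line distances (pixels).

    F     : (3, 3) fundamental matrix mapping image-1 points to
            epipolar lines in image 2 (l2 = F @ p1).
    pts1  : (N, 2) matched pixel coordinates in image 1.
    pts2  : (N, 2) matched pixel coordinates in image 2.
    """
    ones = np.ones((pts1.shape[0], 1))
    p1 = np.hstack([pts1, ones])      # homogeneous coordinates, (N, 3)
    p2 = np.hstack([pts2, ones])
    lines = (F @ p1.T).T              # epipolar lines in image 2, (N, 3)
    num = np.abs(np.sum(lines * p2, axis=1))          # |a*u + b*v + c|
    den = np.sqrt(lines[:, 0] ** 2 + lines[:, 1] ** 2)  # sqrt(a^2 + b^2)
    return num / den

def static_mask(F, pts1, pts2, thresh_px=1.0):
    """True where a match is consistent with static epipolar geometry."""
    return epipolar_residuals(F, pts1, pts2) <= thresh_px
```

A feature whose residual exceeds the threshold violates the static-scene epipolar constraint and is treated as belonging to a moving object; only the surviving static features are then passed to pose estimation.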