A Robust Framework Fusing Visual SLAM and 3D Gaussian Splatting with a Coarse-Fine Method for Dynamic Region Segmentation
Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
Existing visual SLAM systems with neural representations excel in static scenes but fail in dynamic environments where moving objects degrade performance. To address this, we propose a robust dynamic SLAM framework combining classic geometric features for localization with learned photometric features for dense mapping. Our method first tracks objects using instance segmentation and a Kalman filter. We then introduce a cascaded, coarse-to-fine strategy for efficient motion analysis: a lightweight sparse optical flow method performs a coarse screening, while a fine-grained dense optical flow clustering is selectively invoked for ambiguous targets. By filtering features on dynamic regions, our system drastically improves camera pose estimation, reducing Absolute Trajectory Error by up to 95% on dynamic TUM RGB-D sequences compared to ORB-SLAM3, and generates clean dense maps. The 3D Gaussian Splatting backend, optimized with a Gaussian pyramid strategy, ensures high-quality reconstruction. Validations on diverse datasets confirm our system’s robustness, achieving accurate localization and high-fidelity mapping in dynamic scenarios while reducing motion analysis computation by 91.7% over a dense-only approach.