PF-MSVNet: A 3D Object Detection Model With Multi-scale Point-level Feature Fusion

Abstract

3D object detection aims to accurately determine the spatial location of objects and plays an important role in complex tasks such as autonomous driving and aircraft obstacle avoidance. However, 3D point cloud data lacks color and texture information, and traditional single-modality detection methods are prone to missed and false detections. To address this problem, we propose a 3D object detection model based on multi-scale point-level feature fusion (Point Fusion Mixed Sampling VoteNet, PF-MSVNet). First, we use image foreground information to divide the point cloud into foreground and background point sets and randomly downsample the foreground set. Then, we improve the point cloud feature extraction module with residual connections, deepening the network to extract higher-quality features. Finally, we construct a multi-scale point-level feature fusion network and introduce an attention mechanism to suppress interfering image information, deepening the fusion of point cloud and image information at the feature level. Experimental results on the outdoor KITTI dataset and the indoor SUN RGB-D dataset show that PF-MSVNet improves average detection accuracy by 8.61%, 5.86%, 3.65%, and 5.23% over VoxelNet, SECOND, PointRCNN, and F-PointNet, respectively, reaching a maximum mAP of 67.17% among the compared models. Its detection accuracy on small and difficult targets is significantly better than that of the other models, verifying that PF-MSVNet can further improve the accuracy and robustness of object detection.
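The first step of the pipeline, splitting a point cloud into foreground and background sets via an image foreground mask and then randomly downsampling the foreground, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the projection-matrix convention, the mask format, and the `keep_ratio` parameter are all assumptions introduced here.

```python
import numpy as np

def split_and_downsample(points, proj_matrix, fg_mask, keep_ratio=0.5, seed=0):
    """Split a point cloud into foreground/background sets using an image
    foreground mask, then randomly downsample the foreground set.

    points      : (N, 3) 3D points in the camera frame (hypothetical layout).
    proj_matrix : (3, 4) camera projection matrix.
    fg_mask     : (H, W) boolean image foreground mask.
    """
    h, w = fg_mask.shape
    n = points.shape[0]
    # Project 3D points into the image plane (homogeneous coordinates).
    homo = np.hstack([points, np.ones((n, 1))])
    uvw = homo @ proj_matrix.T
    valid = uvw[:, 2] > 0  # keep only points in front of the camera
    uv = np.full((n, 2), -1, dtype=int)
    uv[valid] = (uvw[valid, :2] / uvw[valid, 2:3]).astype(int)
    in_img = valid & (uv[:, 0] >= 0) & (uv[:, 0] < w) \
                   & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    # A point is foreground if it lands on a foreground pixel.
    fg = np.zeros(n, dtype=bool)
    fg[in_img] = fg_mask[uv[in_img, 1], uv[in_img, 0]]
    foreground, background = points[fg], points[~fg]
    # Randomly downsample the foreground set.
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(len(foreground) * keep_ratio))
    idx = rng.choice(len(foreground), size=n_keep, replace=False)
    return foreground[idx], background
```

In practice the mask would come from a 2D segmentation or detection network run on the RGB image; here it is taken as given to keep the sketch self-contained.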