VoxelFSD: voxel-based fully sparse detector with sparse convolution for 3D object detection

Abstract

In recent years, convolutional neural networks (CNNs) from computer graphics have been adapted to 3D object detection, achieving promising performance. However, challenges remain. Current voxel-based detectors suffer a dramatic increase in inference time on large-scale point clouds. To address this, this paper proposes a fully sparse detector, VoxelFSD, which is capable of real-time long-range perception. The model consists of three key components: (1) Parallel Convolutional Branches (PCB), which not only expand the model's receptive field but also mitigate the effect of missing object-center features on the results; (2) a sparse RPN head, which predicts candidate boxes sparsely rather than densely, enabling the model to handle long-range perception tasks effectively; (3) an ROI head with an attention fusion module (AFM-ROI), which uses cross-attention in the second stage to fuse the extracted 3D backbone features with the compressed BEV features, further improving model performance. Based on these modules, we propose a single-stage lightweight detector, VoxelFSD-S, and a two-stage detector, VoxelFSD-T. VoxelFSD-S outperforms previous voxel-based lightweight detectors, while VoxelFSD-T achieves a mAP of 81.50% on the KITTI test set. The code and results are available at https://github.com/seu-zwd/VoxelFSD
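
The difference between dense and sparse candidate prediction can be illustrated with a toy sketch. It is not the VoxelFSD implementation: the function names, the linear scoring rule, the grid size, and the feature values are all hypothetical, chosen only to show that a sparse head does work proportional to the number of occupied cells, while a dense head does work proportional to the full BEV grid, which grows with perception range.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # (x, y) index of a BEV cell


def dense_head_positions(grid_w: int, grid_h: int) -> int:
    """A dense head evaluates every BEV cell, occupied or not."""
    return grid_w * grid_h


def sparse_head(
    features: Dict[Coord, List[float]],
    weights: List[float],
    bias: float,
    threshold: float,
) -> Dict[Coord, float]:
    """Score only the occupied cells with a toy linear classifier
    (an assumption for illustration) and keep those above threshold."""
    candidates: Dict[Coord, float] = {}
    for coord, feat in features.items():
        score = sum(w * f for w, f in zip(weights, feat)) + bias
        if score > threshold:
            candidates[coord] = score
    return candidates


# Hypothetical scene: three occupied cells in a 1000 x 1400 BEV grid.
occupied = {
    (3, 4): [0.9, 0.1],
    (3, 5): [0.2, 0.3],
    (900, 1200): [0.8, 0.7],  # a far-away cell costs no more than a near one
}
weights = [1.0, 0.5]

candidates = sparse_head(occupied, weights, bias=0.0, threshold=0.5)
dense_count = dense_head_positions(1000, 1400)
sparse_count = len(occupied)
```

Here the sparse head scores 3 positions while a dense head over the same grid would score 1,400,000. Extending the range enlarges the grid but leaves the sparse cost tied to the number of occupied cells.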
