Spectral Pyramid Pooling and Fused Keypoint Generation in ResNet-50 for Robust 3D Object Detection
Abstract
Accurate and robust 3D object detection across varied environments is a challenging task, as it relies mainly on the orientation, position and size of the objects. Traditional object detection approaches are degraded by issues such as background clutter, camera viewpoint variations and occlusions. To address these issues, a Faster Region-based Convolutional Neural ResNet-50 (FRCNResNet-50) model is proposed that detects and classifies 3D objects in images. First, images are collected from three data sources: the KITTI dataset, the nuScenes dataset and the MIT Indoor Scene dataset. These images are then preprocessed to enhance image quality and improve the generalization ability of the proposed model. The ResNet-50 model extracts features using a Spectral Pyramid Pooling (SPP) layer and a Fused Keypoint Generation (FKG) layer, which enhance detection efficiency and reduce computational cost. The Faster R-CNN (FRCNN) model is then used to detect 3D objects; it includes an ROI pooling layer that feeds multi-class classification and regression of the corresponding bounding boxes. Experimental validation with quantitative analyses shows that the proposed model achieves an accuracy of 98.58% and a computational time of 98 ms.
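As a rough illustration of the ROI pooling step mentioned in the abstract, the following is a minimal pure-Python sketch of standard ROI max pooling on a single-channel feature map. It is not the authors' implementation: the function name `roi_max_pool`, the half-open ROI convention in feature-map cells, and the 2×2 output grid are all assumptions made for illustration, and the SPP and FKG layers are not modelled here.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]


def roi_max_pool(
    feature_map: FeatureMap,
    roi: Tuple[int, int, int, int],
    out_h: int = 2,
    out_w: int = 2,
) -> FeatureMap:
    """Max-pool one region of interest to a fixed out_h x out_w grid.

    roi is (x0, y0, x1, y1) in feature-map cells, half-open:
    columns x0..x1-1 and rows y0..y1-1 (an assumed convention).
    """
    x0, y0, x1, y1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    if roi_h < out_h or roi_w < out_w:
        raise ValueError("ROI must be at least as large as the output grid")

    pooled = []
    for i in range(out_h):
        # Integer bin edges along the rows; every bin covers >= 1 row.
        r_start = y0 + (i * roi_h) // out_h
        r_end = y0 + ((i + 1) * roi_h) // out_h
        row = []
        for j in range(out_w):
            # Integer bin edges along the columns.
            c_start = x0 + (j * roi_w) // out_w
            c_end = x0 + ((j + 1) * roi_w) // out_w
            # Keep the strongest activation inside this bin.
            row.append(
                max(
                    feature_map[r][c]
                    for r in range(r_start, r_end)
                    for c in range(c_start, c_end)
                )
            )
        pooled.append(row)
    return pooled


# Toy 4x6 feature map; real feature maps would come from the backbone.
toy_map = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]
print(roi_max_pool(toy_map, (1, 0, 6, 3)))  # [[3, 6], [15, 18]]
```

Whatever the size of the proposed region, the output has a fixed shape, which is what lets a detector pass regions of different sizes to the same classification and bounding-box regression heads.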