Multiscale Feature Optimization for Accurate Small Object Detection in Remote Sensing Imagery
Abstract
Detecting small, overlapping objects in high-resolution remote sensing imagery is crucial for applications such as smart city monitoring and disaster response, yet severe feature confusion and spatial misalignment hinder accurate localization. This paper introduces Multiscale SOG-DETR, a systematic redesign of the RT-DETR framework tailored for small-object detection in remote sensing. We propose a lightweight Multiscale Overlapping-Object Decoupling Network (MOODNet) that significantly reduces feature entanglement in overlapping regions. In addition, a specialized fusion neck, comprising the Residual Spatial-Alignment Progressive Fusion Module (SAPFM), E-CGAFusion, and WTConv2d modules, enhances multiscale semantic focus and preserves high-frequency details at low computational cost. On the RSOD, VisDrone2019, and NWPU VHR-10 datasets, Multiscale SOG-DETR achieves higher detection accuracy with significantly fewer parameters than the baseline RT-DETR, improving AP@IoU=0.50 by 3.1%, 3.0%, and 5.2%, and AP@IoU=0.50:0.95 by 5.1%, 2.1%, and 8.5%, respectively. These results position Multiscale SOG-DETR as a promising solution for efficient and accurate small-object detection in remote sensing. The source code is publicly available at https://github.com/AaronWang-code/Multiscale-SOG-DETR.
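To illustrate the wavelet principle that WTConv2d-style layers build on (the module itself is defined in the authors' code; this is only a generic sketch, not the paper's implementation), a single-level 2D Haar decomposition splits a feature map into one low-frequency sub-band and three high-frequency detail sub-bands, each at half resolution. Convolving in this domain lets a network attend to high-frequency detail, which is exactly what small objects contribute, at reduced spatial cost:

```python
def haar2d(x):
    """Single-level 2D Haar decomposition of an even-sized 2D array.

    Returns (LL, LH, HL, HH) sub-bands, each half the input height
    and width. LL carries the low-frequency average; LH, HL, and HH
    carry high-frequency detail (horizontal, vertical, and diagonal
    edges) that a wavelet-based convolution can process separately.
    """
    h, w = len(x), len(x[0])
    LL, LH, HL, HH = [], [], [], []
    for i in range(0, h, 2):
        ll, lh, hl, hh = [], [], [], []
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            ll.append((a + b + c + d) / 2)  # low-pass average
            lh.append((a - b + c - d) / 2)  # horizontal detail
            hl.append((a + b - c - d) / 2)  # vertical detail
            hh.append((a - b - c + d) / 2)  # diagonal detail
        LL.append(ll); LH.append(lh); HL.append(hl); HH.append(hh)
    return LL, LH, HL, HH


# A flat region has no high-frequency content: all detail bands are zero.
LL, LH, HL, HH = haar2d([[1, 1], [1, 1]])
print(LL, LH, HL, HH)  # [[2.0]] [[0.0]] [[0.0]] [[0.0]]
```

A sharp vertical edge, by contrast, concentrates its energy in the LH band, which is why preserving these sub-bands helps retain the fine detail that small remote-sensing objects depend on.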