DMF-YOLO: Dynamic Multi-Scale Feature Fusion Network-Driven Small Target Detection in UAV Aerial Images

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Target detection in UAV aerial images has found increasingly widespread applications in emergency rescue, maritime monitoring, and environmental surveillance. However, traditional detection models suffer significant performance degradation due to challenges including substantial scale variations, high proportions of small targets, and dense occlusions in UAV-captured images. To address these issues, this paper proposes DMF-YOLO, a high-precision detection network based on YOLOv10 improvements. First, we design Dynamic Dilated Snake Convolution (DDSConv) to adaptively adjust the receptive field and dilation rate of convolution kernels, enhancing local feature extraction for small targets with weak textures. Second, we construct a Multi-scale Feature Aggregation Module (MFAM) that integrates dual-branch spatial attention mechanisms to achieve efficient cross-layer feature fusion, mitigating information conflicts between shallow details and deep semantics. Finally, we propose an Expanded Window-based Bounding Box Regression Loss Function (EW-BBRLF), which optimizes localization accuracy through dynamic auxiliary bounding boxes, effectively reducing missed detections of small targets. Experiments on the VisDrone2019 and HIT-UAV datasets demonstrate that DMF-YOLOv10 achieves 50.1% and 81.4% mAP50, respectively, significantly outperforming the baseline YOLOv10s by 27.1% and 2.6%, with parameter increases limited to 24.4% and 11.9%. The method exhibits superior robustness in dense scenarios, complex backgrounds, and long-range target detection. This approach provides an efficient solution for UAV real-time perception tasks and offers novel insights for multi-scale object detection algorithm design.

Article activity feed