MECA-DETR: Multi-scale Edge-aware Contextual Attention DETR for UAV-based Small Object Detection
Abstract
Small object detection in UAV aerial imagery remains a persistent challenge because it requires precise edge preservation, effective multi-scale feature fusion, and stable, efficient training. To tackle these issues, we propose MECA-DETR, a DETR-based end-to-end detection framework tailored for UAV-based small object detection. MECA-DETR comprises three key components: (1) Multi-Scale Edge Fusion (MSEF) enhances structural representation by integrating multi-scale context with high-frequency edge features; (2) Cross-Scale Attention Fusion (CSAF) uses a novel attention mechanism that jointly captures local details and global context to align semantics across scales; (3) Adaptive Intermediate Fusion with DynamicTanh (AIFI_DyT) employs a dynamic channel-wise activation in place of LayerNorm, stabilizing training and accelerating convergence without additional computational cost. Extensive experiments on the VisDrone2019 and DOTA datasets validate the effectiveness of our approach. On VisDrone2019, MECA-DETR achieves 28.7% AP and 40.1% AP50, relative improvements of approximately 7.5% and 10.5%, respectively, over RT-DETR-R18. On DOTA, MECA-DETR attains 33.55% mAP₀.₅:₀.₉₅, outperforming RT-DETR-R50 by 1.81%. These results highlight the effectiveness of our edge-aware multi-scale design and dynamic activation strategy for efficient small object detection in UAV imagery.
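The abstract does not spell out the DynamicTanh formula, so the following is only a minimal sketch of the commonly described form, y = γ · tanh(α · x) + β, used as a drop-in replacement for LayerNorm. The class name `DynamicTanh`, the scalar `alpha`, its initial value of 0.5, and the NumPy implementation are assumptions made for illustration; the variant used inside AIFI_DyT may differ, for example in how the channel-wise parameters are defined.

```python
import numpy as np


class DynamicTanh:
    """Illustrative Dynamic Tanh layer (hypothetical, not the authors' code).

    Computes y = gamma * tanh(alpha * x) + beta over the last (channel) axis.
    Unlike LayerNorm, it needs no mean or variance statistics.
    """

    def __init__(self, num_channels, alpha_init=0.5):
        # Assumed: one scalar that scales the input before tanh.
        self.alpha = float(alpha_init)
        # Assumed: per-channel scale and shift, like LayerNorm's affine part.
        self.gamma = np.ones(num_channels)
        self.beta = np.zeros(num_channels)

    def __call__(self, x):
        # x has shape (..., num_channels); gamma and beta broadcast over it.
        return self.gamma * np.tanh(self.alpha * x) + self.beta


# Usage: a batch of 2 token sequences, 3 tokens each, 4 channels per token.
tokens = np.random.default_rng(0).normal(size=(2, 3, 4))
normalized = DynamicTanh(num_channels=4)(tokens)
```

Because the operation is purely element-wise, it skips the per-token reductions that LayerNorm performs, which is consistent with the abstract's claim of no additional computational cost.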