FAMHE-Net: Multi-Scale Feature Augmentation and Mixture of Heterogeneous Experts for Oriented Object Detection
Abstract
Object detection in remote sensing images is essential for applications such as unmanned aerial vehicle (UAV)-assisted agricultural surveys and aerial traffic analysis, but it faces distinct challenges: low resolution, complex backgrounds, and large variation in object scale. Current detectors struggle to integrate spatial and semantic information effectively across scales and often omit the refinement modules needed to focus on salient features. Furthermore, a detector head that is not carefully designed may be unable to fully exploit the enriched feature representations when making predictions. These deficiencies can lead to insufficient feature representation and reduced detection accuracy. To address these challenges, this paper introduces FAMHE-Net, a novel deep-learning framework for object detection in remote sensing images. The framework features a consolidated multi-scale feature enhancement module (CMFEM) that integrates a Path Aggregation Feature Pyramid Network (PAFPN) and uses our efficient atrous channel attention (EACA) to refine contextual and semantic information. Additionally, we introduce a sparsely gated mixture of heterogeneous expert heads (MOHEH) to adaptively aggregate detector head outputs. Compared to the baseline model, FAMHE-Net achieves a 0.90% increase in mean Average Precision (mAP) on the DOTA dataset and a 1.30% increase in mAP12 on the HRSC2016 dataset. These results highlight the effectiveness of FAMHE-Net for object detection in complex remote sensing images.
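The abstract does not specify how the MOHEH gate selects or weights its expert heads. As a rough illustration of what "sparsely gated" aggregation can look like, the sketch below assumes top-k softmax gating, a common choice in mixture-of-experts models: only the k highest-scoring experts receive a nonzero weight, and their outputs are combined as a weighted sum. The function names, the value of k, and the use of plain lists in place of tensors are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_gate(gate_logits, k):
    """Assumed top-k gating: softmax over the k largest logits, zero elsewhere."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    top = ranked[:k]
    weights = softmax([gate_logits[i] for i in top])
    gates = [0.0] * len(gate_logits)
    for i, w in zip(top, weights):
        gates[i] = w
    return gates


def mix_expert_outputs(expert_outputs, gate_logits, k=2):
    """Weighted sum of the selected experts' output vectors."""
    gates = sparse_gate(gate_logits, k)
    dim = len(expert_outputs[0])
    mixed = [0.0] * dim
    for gate, output in zip(gates, expert_outputs):
        if gate == 0.0:
            continue  # unselected experts contribute nothing
        for j in range(dim):
            mixed[j] += gate * output[j]
    return mixed
```

In a detector, each entry of `expert_outputs` would stand for the prediction of one head (heterogeneous heads may differ in architecture but must agree on output shape), and `gate_logits` would come from a small learned gating network rather than being supplied by hand.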