CSM-DETR: Construction Site Monitoring via Mamba-Enhanced Detection Transformer for UAV Aerial Imagery
Abstract
Unmanned Aerial Vehicles (UAVs) offer significant advantages for construction site monitoring through flexible deployment and high-resolution imagery. However, existing vision-based detection methods struggle with extreme scale variations, dense object distributions, complex backgrounds, and real-time processing constraints. To address these limitations, we propose CSM-DETR, a novel detection transformer specifically designed for UAV-based construction monitoring. Our framework adopts MobileMamba as its backbone to achieve linear computational complexity $\mathcal{O}(n)$ while capturing long-range spatial dependencies, and incorporates the Hierarchical Local-Aware Fusion (HLAF) mechanism for adaptive multi-scale feature aggregation. Furthermore, we propose three key innovations: (1) a Dual-Attention Spatial Integration (DASI) module that enhances multi-scale spatial feature representation through parallel local and global attention streams; (2) a Cross-Scale Deformable Fusion (CSDF) module that enables flexible cross-scale feature interaction through deformable sampling; and (3) a Scale-Aware Composite Loss (SAC Loss) that provides scale-aware supervision for challenging small objects. We also construct a comprehensive benchmark dataset, UAV-CSM47, containing 15,860 high-resolution aerial images spanning 47 construction-related object categories. Extensive experiments demonstrate that CSM-DETR achieves state-of-the-art performance with 91.8% mAP@0.5 and 73.6% mAP@0.5:0.95, outperforming YOLOv13-L by 3.3 percentage points and Co-DETR by 2.7 percentage points while maintaining real-time inference at 38 FPS. Ablation studies validate the effectiveness of each component, and cross-domain evaluation confirms strong generalization capability. The proposed system provides a practical solution for automated construction site monitoring, with broad applications in safety supervision, progress tracking, and resource management.
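For readers less familiar with the two reported metrics, the following is a minimal sketch of the matching criterion behind them, not the evaluation code used in this work. It assumes the conventional COCO-style reading, in which mAP@0.5 counts a prediction as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, and mAP@0.5:0.95 averages over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The helper names `iou` and `thresholds_passed` are hypothetical and chosen for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Return the IoU thresholds at which pred_box would match gt_box."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


# The same 2-pixel localization error on a small and a large object.
small = thresholds_passed((2, 0, 12, 10), (0, 0, 10, 10))      # 10 x 10 px box
large = thresholds_passed((2, 0, 102, 100), (0, 0, 100, 100))  # 100 x 100 px box
print(len(small), len(large))
```

In this toy case the small box matches at only four of the ten thresholds while the large box matches at all ten, which illustrates why the stricter mAP@0.5:0.95 figure is especially sensitive to localization quality on small objects.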