DMSCA: Dynamic Multi-Scale Channel-Spatial Attention for Enhanced Feature Representation in Convolutional Neural Networks
Abstract
The attention mechanism improves convolutional neural networks (CNNs) by emphasizing important features, yet current approaches often fall short in capturing multi-scale context, deeply integrating channel and spatial information, and adapting dynamically to the input. We introduce the Dynamic Multi-Scale Channel-Spatial Attention (DMSCA) mechanism, which synergistically combines multi-scale encoding, directional interactions, and adaptive activations for enhanced feature coupling. Unlike fixed-structure methods such as CBAM, DMSCA employs learnable dynamic weights for adaptive channel-spatial fusion. Its key innovations are Temperature-controlled Channel Attention (TCA) and a Direction-aware Multi-scale Spatial Context Encoder (MSCE). DMSCA integrates six components: a Global Context Encoder, TCA, MSCE, Directional Information Interaction, Dynamic Feature Fusion, and Adaptive Activation. Evaluations on CIFAR-10/100 and ImageNet demonstrate superior performance over state-of-the-art attention mechanisms, with a 1.52% Top-1 accuracy gain on ResNet-50 at modest computational cost (11.3% more parameters and 2.4% more FLOPs).
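To make the temperature-controlled channel attention idea concrete, the following is a minimal NumPy sketch of a TCA-style gate: global average pooling provides per-channel context, a small bottleneck MLP produces channel logits, and a temperature parameter modulates how sharply channels are re-weighted. The function name, the bottleneck design, and the sigmoid gating are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def temperature_channel_attention(x, w1, w2, tau=0.5):
    """Illustrative temperature-controlled channel attention (assumed design).

    x   : feature map of shape (C, H, W)
    w1  : (C//r, C) weight of the bottleneck MLP's first layer
    w2  : (C, C//r) weight of the second layer
    tau : temperature; smaller tau makes the channel gate sharper
    """
    # Global context encoder: per-channel average pooling -> (C,)
    ctx = x.mean(axis=(1, 2))
    # Bottleneck MLP with ReLU, producing per-channel logits
    hidden = np.maximum(w1 @ ctx, 0.0)
    logits = w2 @ hidden
    # Temperature-scaled sigmoid gate in (0, 1)
    gate = 1.0 / (1.0 + np.exp(-logits / tau))
    # Re-weight channels; broadcasting over the spatial dimensions
    return x * gate[:, None, None]

# Example: gate an 8-channel feature map with a reduction ratio of 4
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 6, 6))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1
y = temperature_channel_attention(x, w1, w2, tau=0.5)
```

Lowering `tau` pushes the gate toward a near-binary channel selection, while a higher `tau` yields a smoother re-weighting; in DMSCA this sharpness would be learned or scheduled rather than fixed.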