A Multi-Scale Feature Fusion Hybrid Convolution Attention Model for Birdsong Recognition

Wei Li
Danju Lv
Yueyun Yu
Yan Zhang
Lianglian Gu
Ziqian Wang
Zhicheng Zhu

Abstract

Birdsong is a valuable indicator of biodiversity and carries ecological significance. Although feature extraction has demonstrated satisfactory performance in birdsong classification, single-scale feature extraction methods may not fully capture the complexity of birdsong, potentially leading to suboptimal classification outcomes. Integrating multi-scale feature extraction and fusion enables a model to better handle scale variations, thereby enhancing its adaptability across different scales. To address the limitations of single-scale extraction, we propose a Multi-Scale Hybrid Convolutional Attention Mechanism Model (MUSCA). This method combines depthwise separable convolution and traditional convolution for feature extraction and incorporates self-attention and spatial attention mechanisms to refine spatial and channel features, thereby improving the effectiveness of multi-scale feature extraction. To further enhance multi-scale feature fusion, we developed a layer-by-layer aligned feature fusion method that establishes deeper correlations between scales, thereby improving classification accuracy and robustness. In our study, we investigated the songs of 20 bird species, extracting wavelet spectrogram, log-Mel spectrogram and log-spectrogram features. The classification accuracies achieved by our proposed method were 93.79%, 96.97% and 95.44% for these respective features. The results indicate that the birdsong recognition method proposed in this paper outperforms recent state-of-the-art methods.
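
The abstract pairs depthwise separable convolution with traditional convolution. As a rough illustration of why the two are often combined, the sketch below compares the weight counts of the two layer types using the standard formulas. The function names and the layer sizes (64 input channels, 128 output channels) are assumptions chosen for illustration and are not taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes for illustration only; not from the paper.
for k in (3, 5, 7):
    std = standard_conv_params(k, 64, 128)
    sep = depthwise_separable_params(k, 64, 128)
    print(f"k={k}: standard={std}, separable={sep}, ratio={sep / std:.3f}")
```

The separable variant grows much more slowly with kernel size, which is one plausible reason to use it for larger-scale branches while keeping a standard convolution where full channel mixing is wanted.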

Version published to 10.21203/rs.3.rs-4976065/v1 on Research Square
Oct 4, 2024

Related articles

Research on Multi-Scale Spatio-Temporal Graph Convolutional Human Behavior Recognition Method Incorporating Multi-Granularity Features

This article has 4 authors:
1. Yulin Wang
2. Tao Song
3. Yichen Yang
4. Zheng Hong
This article has no evaluations
Latest version Oct 15, 2024

FasterMLP: Multilayer Perceptron-based Attention Mechanism and Wavelet Sampling Fusion Networks

This article has 4 authors:
1. ChenHao Ma
2. Yong Cao
3. Xueyuan Liu
4. Jian Rong
This article has no evaluations
Latest version Sep 30, 2024

FocusNet: Multi-Scale Parallel CNN Model with an Attention-Based Feature-Fusion Technique

This article has 3 authors:
1. Akash Mehta
2. Rahul Paul
3. Sanjit Kumar Setua
This article has no evaluations
Latest version Nov 4, 2024
