Bird Sound Feature Extraction and Recognition Model Design Based on Neural Network Architecture Search

Abstract

Birds are of great significance to biodiversity. Bird sounds have distinct species-specific characteristics and are an important way for birds to communicate and transmit information. Accurate identification of bird species plays a critical role in biodiversity surveys, and the development of deep learning has made it possible to identify bird species from their acoustic characteristics. However, most deep learning-based research on bird sound recognition relies heavily on traditional experience and subjective, human-designed feature extraction and model structures, without checking whether these choices are optimal for the bird sound data. This study therefore proposes a deep learning model based on neural architecture search. The model uses a simple and equivalent node to design the structure for both feature extraction and recognition, and it includes the inference time as part of the loss function to balance classification accuracy against running time and reduce model complexity. In the feature extraction phase, the final model selects 36 as the optimal number of Mel filter banks. On a dataset of 264 bird sounds, the neural architecture search-based model achieves an average classification accuracy of 92.44% and a maximum classification accuracy of 98.14%. This study provides a meaningful exploration of bird sound recognition that does not rely on traditional experience and subjective human design.
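To make the idea of folding inference time into the loss more concrete, here is a minimal sketch in plain Python. The abstract does not give the exact form of the loss, so everything here is an assumption for illustration: the additive form (cross-entropy plus a weighted latency term), the softmax-weighted expected latency over candidate operations, the weight `lam`, and the names `latency_aware_loss`, `arch_logits`, and `op_latencies_ms` are all hypothetical and are not taken from the article.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(class_logits, target):
    """Cross-entropy of a single example given raw class scores."""
    probs = softmax(class_logits)
    return -math.log(probs[target])


def expected_latency(arch_logits, op_latencies_ms):
    """Expected latency of one searchable node.

    Assumption: each candidate operation has a measured latency, and the
    architecture parameters are relaxed into a softmax over candidates.
    """
    weights = softmax(arch_logits)
    return sum(w * t for w, t in zip(weights, op_latencies_ms))


def latency_aware_loss(class_logits, target, arch_logits, op_latencies_ms, lam=0.01):
    """Classification loss plus a weighted latency penalty.

    Assumption: an additive combination; `lam` (hypothetical) trades
    accuracy against running time.
    """
    ce = cross_entropy(class_logits, target)
    return ce + lam * expected_latency(arch_logits, op_latencies_ms)


# Example: two candidate operations with equal architecture weights.
loss = latency_aware_loss(
    class_logits=[0.0, 0.0],
    target=0,
    arch_logits=[0.0, 0.0],
    op_latencies_ms=[2.0, 4.0],
    lam=0.01,
)
```

With a penalty of this kind, raising `lam` pushes the search toward faster candidate operations, while `lam = 0` reduces the objective to plain classification loss.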
