Improved Detection of Bird Vocalisations Using BirdNET Embeddings and Machine Learning
Abstract
Automated bird sound recognition has become an essential tool for biodiversity monitoring, enabling large-scale species detection from audio recordings. BirdNET is a well-known deep learning algorithm that was trained on a large dataset of community-labeled recordings and has demonstrated strong performance in identifying bird species. When it is applied to a particular case, such as a specific species or geographical location, its performance can be improved through fine-tuning or by adding a posterior classification step. This study investigates the detection of the Eurasian Woodcock (Scolopax rusticola). BirdNET embeddings are used as feature representations, and classifiers are trained on these features. A strongly labeled dataset was created by manually annotating 97 recent recordings (2023–2024) from Xeno-canto, yielding 501 positive segments and 2,505 negative segments. BirdNET was then evaluated on this dataset as a baseline, achieving an average precision of 84.3%. To improve detection accuracy, three machine learning classifiers were trained on the embeddings: Support Vector Machine (SVM), Random Forest, and XGBoost. The results show a significant improvement over the baseline, with overall average precision scores reaching 99–100%. These results suggest that a hybrid deep learning and machine learning approach can substantially enhance bird species recognition, particularly in challenging acoustic environments. The present work thus contributes to bioacoustic classification methodology by demonstrating how deep learning embeddings can be combined effectively with traditional classifiers and strongly labeled data.
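The abstract reports average precision as its evaluation metric. As an illustration only, the following minimal sketch shows one common way to compute that metric for a binary detector: rank the segments by classifier score and average the precision obtained at the rank of each positive segment. The function name `average_precision` and the toy labels and scores are hypothetical and are not taken from the study; the sketch also assumes there are no tied scores, which library implementations may handle differently.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive item.

    labels: 1 for a positive segment (target species present), 0 otherwise.
    scores: classifier confidence for each segment (higher = more likely positive).
    Assumption: scores contain no ties.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank segments from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision among the top `rank` segments.
            precision_sum += hits / rank
    return precision_sum / n_pos


# Toy example (made-up values): three positives among six segments.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_ap = average_precision(toy_labels, toy_scores)
print(f"average precision: {toy_ap:.4f}")
```

In this toy ranking the positives fall at ranks 1, 3 and 6, so the precisions are 1/1, 2/3 and 3/6, and their mean is 13/18. A detector that ranks every positive segment above every negative one scores 1.0.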