Data-driven Sampling Strategies for Fine-Tuning Bird Detection Models

Corentin Bernard
Ben McEwen
Benjamin Cretois
Hervé Glotin
Dan Stowell
Ricard Marxer

Abstract

Passive Acoustic Monitoring (PAM) has emerged as a promising tool for collecting ecological data, particularly in the context of bird population monitoring. Bird species can be automatically identified using pre-trained models, such as BirdNET. The performance of these models can be significantly improved through fine-tuning with annotated samples recorded in the specific acoustic conditions in which the microphones are deployed. However, PAM collects vast amounts of data, and annotating bird vocalizations requires specialized expertise. As a result, only a very small portion of the recordings can be effectively labeled. Selecting the most relevant samples to annotate in order to maximize performance in model fine-tuning remains a significant challenge. First, a regularization technique addresses the challenge of class imbalance during model fine-tuning. Next, a data-driven methodology is developed, introducing the influence score, which quantifies the impact of individual training samples on model performance to inform sampling strategies. A linear model is proposed to estimate the influence score for generalization to unseen data. Finally, several sampling strategies are compared, based on acoustic indices and predictions of the pre-trained model. Together, these contributions enable the identification of efficient annotation strategies to overcome the challenges of limited annotation resources in large-scale passive acoustic monitoring.
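The abstract does not give the formula for the influence score, so the following is only a minimal sketch of the general idea under stated assumptions. It uses a leave-one-out difference in validation accuracy as a stand-in for the influence score, a nearest-centroid classifier as a stand-in for fine-tuning a pre-trained model, and synthetic two-dimensional descriptors in place of acoustic indices or model predictions. All function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Toy 'fine-tuning': one mean descriptor vector per class (assumption)."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def accuracy(centroids, X, y):
    """Accuracy of nearest-centroid predictions on (X, y)."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return float((dists.argmin(axis=1) == y).mean())

def leave_one_out_influence(X_train, y_train, X_val, y_val, n_classes):
    """Influence of sample i = accuracy with i minus accuracy without i.

    This leave-one-out definition is an illustrative assumption, not
    necessarily the definition used by the authors.
    """
    base = accuracy(fit_centroids(X_train, y_train, n_classes), X_val, y_val)
    influence = np.empty(len(X_train))
    for i in range(len(X_train)):
        keep = np.arange(len(X_train)) != i
        reduced = fit_centroids(X_train[keep], y_train[keep], n_classes)
        influence[i] = base - accuracy(reduced, X_val, y_val)
    return influence

def fit_linear_influence_model(descriptors, influence):
    """Least-squares fit of influence on sample descriptors plus a bias term."""
    A = np.column_stack([descriptors, np.ones(len(descriptors))])
    w, *_ = np.linalg.lstsq(A, influence, rcond=None)
    return w

def predict_influence(w, descriptors):
    """Estimated influence for samples that have not been annotated yet."""
    return np.column_stack([descriptors, np.ones(len(descriptors))]) @ w

# Synthetic data: two classes drawn from shifted Gaussians (assumption).
rng = np.random.default_rng(0)

def make_split(n_per_class):
    X = np.concatenate([rng.normal(-1.0, 1.0, (n_per_class, 2)),
                        rng.normal(1.0, 1.0, (n_per_class, 2))])
    y = np.repeat([0, 1], n_per_class)
    return X, y

X_train, y_train = make_split(20)
X_val, y_val = make_split(50)
X_pool, _ = make_split(10)  # unlabeled pool; labels are ignored

influence = leave_one_out_influence(X_train, y_train, X_val, y_val, n_classes=2)
w = fit_linear_influence_model(X_train, influence)

# Rank pool samples by estimated influence, highest first.
ranked = np.argsort(-predict_influence(w, X_pool))
```

In this sketch, `ranked` orders the unlabeled pool so that the samples with the highest estimated influence would be sent for annotation first. On toy data like this, removing a single sample barely moves the class centroids, so the measured influences are small; the point is the workflow (measure influence on labeled samples, fit a linear estimator, apply it to unseen samples), not the numbers.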

Version published to 10.1101/2025.10.02.679964 on bioRxiv
Oct 4, 2025

Related articles

Stratified Active Learning for Spatiotemporal Generalisation in Large-Scale Bioacoustic Monitoring

This article has 3 authors:
1. Ben McEwen
2. Corentin Bernard
3. Dan Stowell
This article has no evaluations. Latest version: Sep 5, 2025

Estimating how Site-Level Differences in Acoustic Environments Affect Species Detection by Machine Learning Models

This article has 6 authors:
1. Ruari Marshall-Hawkes
2. Simon Gillings
3. Mark W. Wilson
4. Anthony S. Wetherhill
5. Lynn V. Dicks
6. Adham Ashton-Butt
This article has no evaluations. Latest version: Sep 4, 2025

A global assessment of BirdNET performance: differences among continents, biomes, and species

This article has 100 authors:
1. David Funosas
2. Esther Sebastián-González
3. Jon Morant
4. Oscar H. Marín Gómez
5. Irene Mendoza
6. Miguel A. Mohedano-Muñoz
7. Eduardo Santamaría
8. Giulia Bastianelli
9. Alba Márquez-Rodríguez
10. Michał Budka
11. Gerard Bota
12. José M. De la Peña-Rubio
13. Eladio García de la Morena
14. Manu Santa-Cruz
15. Pablo de la Nava
16. Mario Fernández-Tizón
17. Hugo Sánchez-Mateos
18. Adrián Barrero
19. Juan Traba
20. Tomasz S. Osiejuk
21. Cristina D. Alonso-Moya
22. Patrick J. Hart
23. Amanda K. Navine
24. Andrés F. Montoya Muñoz
25. Carlos B. de Araujo
26. Gabriel L. M. Rosa
27. Ingrid M. D. Torres
28. Ana L. C. Catalano
29. Cassio Rachid Simões
30. Diego Llusia
31. Manuel B. Morales
32. Pablo Acebes
33. Juan A. Medina
34. Nicholas Brown
35. Christos Astaras
36. Ilias Karmiris
37. Elizabeth Navarrete
38. Maxime Cauchoix
39. Luc Barbaro
40. Dominik Arend
41. Sandra Müeller
42. Fernando González-García
43. Alberto González-Romero
44. Christos Mammides
45. Michaelangelo Pontikis
46. Giordano Jacuzzi
47. Julian D. Olden
48. Sara P. Bombaci
49. Gabriel Marcacci
50. Alain Jacot
51. Juan P. Zurano
52. Elena Gangenova
53. Diego Varela
54. Facundo Di Sallo
55. Gustavo A. Zurita
56. Andrey Atemasov
57. Junior A. Tremblay
58. Vincent Lamarre
59. Anja Hutschenreiter
60. Alan Monroy-Ojeda
61. Mauricio Díaz-Vallejo
62. Sergio Chaparro-Herrera
63. Robert A. Briers
64. Renata Sousa-Lima
65. Thiago Pinheiro
66. Wigna C. da Silva
67. Alice Calvente
68. Raiane V. Paz
69. Carlos Salustio-Gomes
70. Dorgival D. Oliveira-Júnior
71. Cicero S. Lima-Santos
72. Mauro Pichorim
73. Anamaria Dal Molin
74. Alexandre Antonelli
75. Svetlana Gogoleva
76. Igor Palko
77. Hiếu V. Trong
78. Marina H. L. Duarte
79. Natalia dos Santos Saturnino
80. Samuel R. Silva
81. Ana Rainho
82. Paula Lopes
83. Karl-L. Schuchmann
84. Marinêz I. Marques
85. Ana S. de Oliverira Tissiani
86. Nick A. Littlewood
87. Mao-Ning Tuanmu
88. Sebastian Kepfer-Rojas
89. Andrea L. Aguilera
90. Lluís Brotons
91. Mariano J. Feldman
92. Louis Imbeau
93. Pooja Panwar
94. Aaron S. Weed
95. Anant Dehwal
96. Alfredo Attisano
97. Jörn Theuerkauf
98. Eben Goodale
99. Kevin F.A. Darras
100. Cristian Pérez-Granados
This article has no evaluations. Latest version: Oct 14, 2025
