Breaking the Single-Mosquito Barrier: Synthetic Swarm Data for Acoustic Vector Monitoring

Abstract

Mosquito-borne diseases pose a significant global health threat, causing more than 700,000 deaths annually and placing increasing pressure on vector surveillance programs. This work presents a proof-of-concept Synthetic Swarm Mosquito Dataset designed for acoustic classification under realistic multi-species, noisy swarm conditions. Conventional datasets require labor-intensive recording of individual mosquitoes; our synthetic approach instead enables scalable, reproducible data generation without extensive fieldwork. The core novelty of this study is the controlled synthesis of swarm audio that captures overlapping wingbeat patterns, variable densities, and heterogeneous noise profiles, all of which are difficult to obtain from natural environments. Using log-mel spectrogram representations, we evaluate several lightweight deep learning architectures and demonstrate that they can reliably classify six major mosquito vectors, even under swarm-induced acoustic interference. We further validate the method on real-world mosquito audio collected from a CO₂-baited trap in Vung Tau, Vietnam. This field evaluation demonstrates that the model trained on synthetic data generalizes effectively to practical deployment conditions, confirming the real-world feasibility of swarm-capable acoustic mosquito surveillance. The public dataset used in this study is available online.
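
The abstract does not spell out the synthesis procedure, so the following is only a rough illustration of what overlaying wingbeat signals at a chosen density and noise level might look like. Everything here is an assumption made for the sketch rather than the authors' method: the function names `synth_wingbeat` and `mix_swarm`, the harmonic toy tones standing in for single-mosquito recordings, the wingbeat frequencies, the gain range, and the SNR-based noise scaling.

```python
import numpy as np

def synth_wingbeat(f0, duration, sr, rng, n_harmonics=4):
    """Toy stand-in for one mosquito: a harmonic tone at wingbeat
    frequency f0 (Hz) with a slow 1% frequency wobble (assumed values)."""
    t = np.arange(int(duration * sr)) / sr
    wobble_hz = rng.uniform(0.5, 2.0)
    jitter = 1.0 + 0.01 * np.sin(2 * np.pi * wobble_hz * t + rng.uniform(0, 2 * np.pi))
    # Integrate instantaneous frequency to get a continuous phase.
    phase = 2 * np.pi * np.cumsum(f0 * jitter) / sr
    tone = sum(np.sin(k * phase) / k for k in range(1, n_harmonics + 1))
    return tone / np.max(np.abs(tone))

def mix_swarm(clips, noise, snr_db, rng):
    """Overlay clips with random gains and circular time offsets,
    then add noise scaled so the swarm-to-noise ratio equals snr_db."""
    n = len(noise)
    swarm = np.zeros(n)
    for clip in clips:
        gain = rng.uniform(0.3, 1.0)      # crude proxy for distance to the microphone
        looped = np.resize(clip, n)       # repeat or truncate the clip to n samples
        swarm += gain * np.roll(looped, rng.integers(n))
    signal_power = np.mean(swarm ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = scale * noise
    mixture = swarm + scaled_noise
    peak = np.max(np.abs(mixture))
    return mixture / peak, swarm, scaled_noise

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sr = 8000
    # Hypothetical wingbeat frequencies; not tied to any real species.
    clips = [synth_wingbeat(f0, 1.0, sr, rng) for f0 in (400.0, 550.0, 700.0)]
    noise = rng.standard_normal(2 * sr)
    mixture, swarm, scaled_noise = mix_swarm(clips, noise, snr_db=10.0, rng=rng)
    achieved = 10 * np.log10(np.mean(swarm ** 2) / np.mean(scaled_noise ** 2))
    print(f"samples: {len(mixture)}, achieved SNR: {achieved:.2f} dB")
```

Swarm density in this sketch is simply the number of clips passed to `mix_swarm`, and the noise profile is whatever array is supplied as `noise`, so both can be varied independently. A log-mel spectrogram of `mixture` would then be computed with an audio library of choice before classification.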
