Vib2Sound: Separation of Multimodal Sound Sources
Abstract
Understanding animal social behaviors, including vocal communication, requires longitudinal observation of interacting individuals. However, isolating individual-level vocalizations in complex environments is challenging due to background noise and frequent overlap between signals from multiple vocalizers. A promising solution lies in multimodal recordings that combine traditional microphones with animal-borne sensors, such as accelerometers and directional microphones. These sensors, however, are constrained by strict limits on weight, size, and power consumption, and often yield noisy or unstable signals. In this work, we introduce a neural network-based system for sound source separation that leverages multi-channel microphone recordings together with body-mounted accelerometer signals. Using a dataset of zebra finches recorded in a social setting, we demonstrate that contact sensing substantially outperforms conventional microphone-array recordings. By enabling the separation of overlapping vocalizations, our approach offers a valuable tool for studying animal communication in complex, naturalistic environments.
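To make the multimodal setup concrete, the sketch below illustrates one plausible way a neural network could fuse multi-channel microphone spectrograms with a body-mounted accelerometer stream to estimate a separation mask for a single vocalizer. The abstract does not specify the actual architecture, so the class name `MultimodalSeparator`, the layer choices, and all shapes and sizes are illustrative assumptions, not the authors' method.

```python
# Minimal sketch, assuming a mask-based separation scheme; the paper's actual
# architecture is not described in the abstract. Layer sizes are illustrative.
import torch
import torch.nn as nn

class MultimodalSeparator(nn.Module):
    def __init__(self, n_mics=4, n_accel=3, hidden=128, n_freq=257):
        super().__init__()
        # Per-modality encoders over time frames: microphone magnitude
        # spectrograms (all channels stacked) and raw accelerometer axes.
        self.mic_enc = nn.Conv1d(n_mics * n_freq, hidden, kernel_size=3, padding=1)
        self.acc_enc = nn.Conv1d(n_accel, hidden, kernel_size=3, padding=1)
        # Fuse the two streams and model temporal context.
        self.rnn = nn.GRU(2 * hidden, hidden, batch_first=True, bidirectional=True)
        # Predict a time-frequency mask for the target bird's vocalization.
        self.mask_head = nn.Linear(2 * hidden, n_freq)

    def forward(self, mic_spec, accel):
        # mic_spec: (batch, n_mics, n_freq, time); accel: (batch, n_accel, time)
        b, m, f, t = mic_spec.shape
        a = self.mic_enc(mic_spec.reshape(b, m * f, t))      # (b, hidden, t)
        v = self.acc_enc(accel)                              # (b, hidden, t)
        fused, _ = self.rnn(torch.cat([a, v], dim=1).transpose(1, 2))
        mask = torch.sigmoid(self.mask_head(fused))          # (b, t, n_freq)
        # Apply the mask to a reference microphone channel.
        return mask.transpose(1, 2) * mic_spec[:, 0]         # (b, n_freq, t)

# Usage with random tensors: two recordings, 4 mics, 3 accelerometer axes,
# 257 frequency bins, 100 time frames.
model = MultimodalSeparator()
mic = torch.randn(2, 4, 257, 100)
acc = torch.randn(2, 3, 100)
est = model(mic, acc)  # estimated target spectrogram, shape (2, 257, 100)
```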