Shorter FFT Windows Improve Cross-Domain Generalization in CNN-Based Cetacean Whistle Detection: A Controlled Sensitivity Analysis

Rocco De Marco


Abstract

In spectrogram-based Convolutional Neural Network (CNN) detectors for Passive Acoustic Monitoring (PAM), the FFT window length directly governs the spectro-temporal representation presented to the classifier, yet its effect on detection performance has received limited systematic treatment. This study presents a controlled sensitivity analysis of FFT window length (256, 512, and 1024 samples) on binary bottlenose dolphin (Tursiops truncatus) whistle detection, evaluated through stratified 10-fold cross-validation on an in-domain dataset (192 kHz) and an independent cross-domain benchmark. In-domain performance is uniformly high across all configurations (macro F1-score ≈ 0.98; Wilcoxon, all p > 0.05). Cross-domain results diverge substantially: the shortest window is significantly superior (p = 0.006, rank-biserial r = 0.89). The mechanism is an upsampling amplification effect: coarser spectral bins produce wider, higher-contrast frequency-modulated traces after resampling to fixed image dimensions. This superiority is threshold-invariant: precision equals 1.000 across all tested configurations and decision thresholds. A multiclass extension to five vocalization categories (macro F1-score = 0.843) confirms the framework’s scalability. All experiments were conducted within a six-stage open-source pipeline fully parameterized through a single configuration file, ensuring exact reproducibility. Software source code and both datasets are publicly available.
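The upsampling amplification effect described in the abstract can be made concrete with some simple arithmetic. The sketch below is an illustration only, not the authors' pipeline: it computes the spectral bin width and window duration for each FFT length at the 192 kHz sampling rate, and estimates how many image rows a single frequency bin would occupy after resizing. The 224-pixel image height, the use of the full 0 Hz to Nyquist band, and the function name `spectrogram_geometry` are assumptions made for this example; the abstract does not state the actual image dimensions or displayed band.

```python
def spectrogram_geometry(n_fft, sample_rate_hz=192_000, image_height_px=224):
    """Return basic spectrogram geometry for one FFT window length.

    image_height_px is a hypothetical value chosen for illustration;
    the source does not specify the resized image dimensions.
    """
    # Width of one frequency bin in Hz (frequency resolution).
    bin_width_hz = sample_rate_hz / n_fft
    # Number of one-sided FFT bins covering 0 Hz up to Nyquist.
    n_bins = n_fft // 2 + 1
    # Duration of the analysis window in milliseconds.
    window_ms = 1000.0 * n_fft / sample_rate_hz
    # Image rows covered by one bin after resizing to a fixed height.
    # Values above 1 mean each bin is stretched (upsampled);
    # values below 1 mean several bins are merged into one row.
    px_per_bin = image_height_px / n_bins
    return {
        "n_fft": n_fft,
        "bin_width_hz": bin_width_hz,
        "n_bins": n_bins,
        "window_ms": window_ms,
        "px_per_bin": px_per_bin,
    }


if __name__ == "__main__":
    for n_fft in (256, 512, 1024):
        g = spectrogram_geometry(n_fft)
        print(
            f"N={g['n_fft']:4d}  bin={g['bin_width_hz']:6.1f} Hz  "
            f"bins={g['n_bins']:3d}  window={g['window_ms']:.2f} ms  "
            f"rows/bin={g['px_per_bin']:.2f}"
        )
```

Under these assumptions, the 256-sample window yields 750 Hz bins that are each stretched over more than one image row, whereas the 1024-sample window yields 187.5 Hz bins that are compressed to less than half a row. This is consistent with the abstract's explanation that coarser bins produce wider whistle traces after resampling.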

Version published to 10.64898/2026.05.01.721665 on bioRxiv
May 6, 2026

