A model-based analysis of modulation masking effects on vocoded speech intelligibility

Cathrina Veigel
Helia Relaño-Iborra
Andrew Oxenham
Torsten Dau


Abstract

Vocoded speech is widely used to simulate aspects of cochlear implant (CI) listening, and modeling vocoded intelligibility in normal-hearing (NH) listeners can provide insight into auditory processing under electric hearing. Sentence recognition was measured using three masker types (speech-shaped noise, tone complex, and noise-modulated tone complex) and four processing conditions: unprocessed, bandpass-filtered, tone-vocoded, and tone-vocoded with simulated current spread. Masker type affected intelligibility in all conditions except vocoded speech with simulated current spread. A modified speech-based envelope power spectrum model with a correlation-based decision stage (sEPSMcorr) was fitted to the unprocessed speech-shaped-noise condition and then applied, without re-fitting, to the remaining conditions. The model captured masker-dependent intelligibility patterns for unprocessed, bandpass-filtered, and tone-vocoded speech without simulated current spread. Analyses of the model's internal representations suggested that vocoding-related intelligibility loss was driven mainly by the removal of higher-rate temporal-envelope modulations rather than by reduced spectral resolution. With simulated current spread, the model predicted the loss of benefit for unmodulated maskers, but consistently overestimated intelligibility. These findings support the view that modulation masking is not the primary limitation in simulated electric hearing and point to envelope-energy and pitch-related distortions as important targets for future CI-oriented intelligibility models.

Version published to 10.31234/osf.io/3h9pv_v1 on OSF Preprints
Mar 28, 2026

Related articles

Effect of Edge-Based Acoustic Modifications on Speech Intelligibility and SNR Thresholds in Classrooms: A Field Study

This article has 1 author:
1. Sebastian Kümmritz
This article has no evaluations
Latest version Mar 31, 2026

Modeling the Influence of Bandwidth and Envelope on Categorical Loudness Scaling

This article has 5 authors:
1. Stephen T. Neely
2. Sara E. Harris
3. Joshua J. Hajicek
4. Erik A. Petersen
5. Yi Shen
This article has no evaluations
Latest version Apr 1, 2026

Modulation statistics allow robust prediction of speech recognition accuracy across many words, voices, and natural background sounds

This article has 3 authors:
1. Alex C. Clonan
2. Ian H. Stevenson
3. Monty A. Escabí
This article has no evaluations
Latest version Apr 30, 2026
