A model-based analysis of modulation masking effects on vocoded speech intelligibility
Abstract
Vocoded speech is widely used to simulate aspects of cochlear implant (CI) listening, and modeling vocoded intelligibility in normal-hearing (NH) listeners can provide insight into auditory processing under electric hearing. Sentence recognition was measured using three masker types (speech-shaped noise, tone complex, and noise-modulated tone complex) and four processing conditions: unprocessed, bandpass-filtered, tone-vocoded, and tone-vocoded with simulated current spread. Masker type affected intelligibility in all conditions except vocoded speech with simulated current spread. A modified speech-based envelope power spectrum model with a correlation-based decision stage (sEPSMcorr) was fitted to the unprocessed speech-shaped-noise condition and then applied, without re-fitting, to the remaining conditions. The model captured masker-dependent intelligibility patterns for unprocessed, bandpass-filtered, and tone-vocoded speech without simulated current spread. Analyses of the model's internal representations suggested that vocoding-related intelligibility loss was driven mainly by the removal of higher-rate temporal-envelope modulations rather than by reduced spectral resolution. With simulated current spread, the model predicted the loss of benefit for unmodulated maskers, but consistently overestimated intelligibility. These findings support the view that modulation masking is not the primary limitation in simulated electric hearing and point to envelope-energy and pitch-related distortions as important targets for future CI-oriented intelligibility models.