Hearing in categories aids speech streaming at the “cocktail party”

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Our perceptual system bins elements of the speech signal into categories to make speech perception manageable. Here, we aimed to test whether hearing speech in categories (as opposed to a continuous/gradient fashion) affords yet another benefit to speech recognition: parsing noisy speech at the “cocktail party.” We measured speech recognition in a simulated 3D cocktail party environment. We manipulated task difficulty by varying the number of additional maskers presented at other spatial locations in the horizontal soundfield (1-4 talkers) and via forward vs. time-reversed maskers, promoting more and less informational masking (IM), respectively. In separate tasks, we measured isolated phoneme categorization using two-alternative forced choice (2AFC) and visual analog scaling (VAS) tasks designed to promote more/less categorical hearing and thus test putative links between categorization and real-world speech-in-noise skills. We first show that listeners can only monitor up to ∼3 talkers despite up to 5 in the soundscape and streaming is not related to extended high-frequency hearing thresholds (though QuickSIN scores are). We then confirm speech streaming accuracy and speed decline with additional competing talkers and amidst forward compared to reverse maskers with added IM. Dividing listeners into “discrete” vs. “continuous” categorizers based on their VAS labeling (i.e., whether responses were binary or continuous judgments), we then show the degree of IM experienced at the cocktail party is predicted by their degree of categoricity in phoneme labeling; more discrete listeners are less susceptible to IM than their gradient responding peers. Our results establish a link between speech categorization skills and cocktail party processing, with a categorical (rather than gradient) listening strategy benefiting degraded speech perception. These findings imply figure-ground deficits common in many disorders might arise through a surprisingly simple mechanism: a failure to properly bin sounds into categories.

Article activity feed