Segmental cues for self-other and familiar-other voice distinction
Abstract
Our own voice is essential for communication and identity, yet we often struggle to recognize it in recordings. This is largely due to bone conduction, which alters its spectral balance in ways we only perceive while speaking. However, because bone-conducted energy is stronger for segments with greater oral closure, like /u/, listeners may more easily recognize self-voice in these sounds, even in air-conducted recordings. We investigated whether short phonemes convey different perceptual and acoustic cues to self- and other-voice recognition. Twenty-six participants judged self- and familiar-voice identity in stimuli created by morphing their recorded phonemes (/a/, /u/, /ʃ/, /s/) with unfamiliar voices. Performance varied systematically by phoneme: vowels yielded higher recognition accuracy than sibilants, with /a/ yielding the strongest performance and /s/ the weakest. Although sibilants supported above-chance speaker discrimination, participants often confused voices. No phoneme showed a self-voice-specific advantage, suggesting that speaking-related bone-conducted cues do not aid self-voice recognition in air-conducted input. Acoustic parameters (e.g., fundamental frequency for vowels) showed weak links to performance, suggesting non-acoustic cognitive mechanisms (e.g., memory template matching) aided recognition. Overall, our findings show that even brief segmental cues carry identity information, underscoring the complex interplay of acoustic detail and cognitive processing in voice recognition.