Robust spatial hearing beyond primary interaural cues in humans over deep neural networks
Abstract
Spatial hearing allows humans to localize sound sources in the azimuth plane using interaural time (ITD) and level (ILD) differences, but the contribution of additional auditory features remains unclear. To investigate this, we measured human localization performance with natural and artificial stimuli that selectively included or excluded ITD and ILD as primary interaural cues. As expected, human listeners relied synergistically on ITD and ILD for accurate azimuth localization. Moreover, even when both primary cues were absent, localization performance remained above chance level. To investigate possible computational mechanisms underlying this robust performance, we compared human performance with that of state-of-the-art deep neural networks (DNNs) optimized for sound localization. In contrast to humans, DNNs demonstrated high accuracy only for stimuli that resembled their training regime but failed when primary interaural cues were absent. This human-DNN misalignment highlights a fundamental distinction in sensory processing strategies, potentially arising from the simplicity bias inherent in DNN training, with human reliance on a wider range of auditory features likely reflecting evolutionary pressures favoring adaptability across diverse acoustic environments. Together, our results demonstrate the robustness of human spatial hearing beyond primary interaural cues and point to promising directions for advancing artificial systems and informing clinical applications, such as cochlear implants and auditory prosthetics.