A standardized naturalistic audio stimuli database with unsupervised labeling
Abstract
Research in cognitive neuroscience has relied on simple, highly controlled stimuli because standardized, ecologically valid stimulus sets are difficult to develop. However, there is a consensus that ecologically valid stimuli are needed to generalize results beyond controlled laboratory settings. The current study introduces a naturalistic audio stimulus database consisting of short, recognizable, and emotionally rated stimuli. To create the database, 291 audio files were collected from a wide range of sources. A total of 361 participants rated the audio clips on emotionality, arousal, and recognizability, and then freely described each clip by typing what they believed the sound to be. The participants' text responses were embedded and clustered using an unsupervised machine-learning algorithm to derive a participant-grounded organization of auditory object categories. The results indicate that the audio clips were easily recognizable, while emotionality and arousal ratings showed broad variability, making the database suitable for diverse experimental needs. Furthermore, the final database comprises 10 distinct semantic categories, providing a diverse set of auditory stimuli.
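The embed-then-cluster step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the embedding model or the clustering algorithm, so everything here is an assumption made for illustration only: a bag-of-words vector stands in for the text embedding, a small k-means with cosine similarity stands in for the unsupervised algorithm, and the free-text responses are invented examples, not data from the study.

```python
import math
from collections import Counter

# Hypothetical free-text responses (invented for illustration, not study data).
responses = [
    "dog barking",
    "a dog barking loudly",
    "barking dog",
    "car engine starting",
    "engine of a car",
    "car engine",
]

def embed(text, vocab):
    """Bag-of-words count vector, L2-normalised.
    Assumption: a simple stand-in for whatever embedding the study used."""
    counts = Counter(text.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Tiny k-means on unit vectors using cosine similarity.
    Assumption: a stand-in for the unnamed unsupervised algorithm."""
    # Deterministic initialisation: start from the first vector, then
    # repeatedly add the vector least similar to its closest centroid.
    centroids = [vectors[0]]
    while len(centroids) < k:
        farthest = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(farthest)

    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its most similar centroid.
        labels = [max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors]
        # Recompute each centroid as the normalised mean of its members.
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean)) or 1.0
                centroids[j] = [x / norm for x in mean]
    return labels

vocab = sorted({word for r in responses for word in r.lower().split()})
vectors = [embed(r, vocab) for r in responses]
labels = kmeans(vectors, k=2)

for response, label in zip(responses, labels):
    print(label, response)
```

In this toy example the responses about a dog and the responses about a car fall into separate clusters. The point of the sketch is only that category labels emerge from participants' own descriptions rather than from a predefined taxonomy.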