An automated pipeline for efficiently generating standardized, child-friendly audiovisual stimuli

Bianca Santi
Matthew Soza
Greta Tuckute
Aalok Sathe
Evelina Fedorenko
Halie Olson

Abstract

Creating engaging, well-controlled neuroimaging tasks for children can be difficult and time-consuming. To simplify and accelerate the process, we developed an automated pipeline that combines existing audio generation and animation tools to generate customizable audiovisual stimuli from text input, such as for studies of language comprehension. The pipeline consists of two components: the first generates auditory stimuli from text using Google Cloud Text-to-Speech, and the second uses Adobe Character Animator to create video stimuli in which an animated character says the stimuli out loud. We evaluated the pipeline with two stimulus sets, including comparing generated audio stimuli to existing human-recorded stimuli. The pipeline is efficient, taking less than 2 minutes to generate each audiovisual stimulus, and less than 9% of stimuli needed to be regenerated. The audio generation component is particularly fast, taking less than 1 second per stimulus, and the resulting stimuli are less variable in pitch and some measures of intensity than human-recorded audio stimuli. This pipeline demonstrates the potential of leveraging automated tools for stimuli development, especially for stimuli that are time-consuming to create manually and for designs that require large quantities of well-controlled stimuli.
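The audio-generation step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes the `google-cloud-texttospeech` Python client is installed and credentials are configured, and the voice name, output directory, and file-naming scheme (`stim_001.wav`, and so on) are hypothetical choices made for the example. The animation step in Adobe Character Animator is not shown.

```python
from pathlib import Path


def plan_outputs(sentences, out_dir="audio"):
    """Pair each text stimulus with a numbered .wav path (hypothetical naming scheme)."""
    out = Path(out_dir)
    return [(text, out / f"stim_{i:03d}.wav") for i, text in enumerate(sentences, start=1)]


def synthesize(text, out_path, voice_name="en-US-Standard-C"):
    """Synthesize one stimulus with Google Cloud Text-to-Speech and save it as WAV.

    The voice and encoding here are assumptions; the preprint's actual settings
    are not stated in the abstract.
    """
    # Imported lazily so the planning helper works without the cloud client installed.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(language_code="en-US", name=voice_name),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.LINEAR16
        ),
    )
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_bytes(response.audio_content)


def generate_all(sentences, out_dir="audio"):
    """Generate one audio file per stimulus; returns the list of written paths."""
    written = []
    for text, path in plan_outputs(sentences, out_dir):
        synthesize(text, path)
        written.append(path)
    return written
```

Keeping the file planning separate from the network call means the stimulus list and naming can be checked before any audio is requested, which may help when a subset of stimuli needs to be regenerated.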

Version published to 10.31234/osf.io/8gcn7_v1 on OSF Preprints
Apr 2, 2025

Related articles

From creating to translating: Chinese students’ audio description processes for film

This article has 2 authors:
1. Weiqing Xiao
2. Ying Li
This article has no evaluations. Latest version: Apr 9, 2026

Multimodal speech perception in noise after multisensory perceptual training in autistic children and adolescents

This article has 6 authors:
1. Laura Möde
2. Erfan Ghaneirad
3. Gregor R. Szycik
4. Hans Worthmann
5. Stefan Bleich
6. Anna Borgolte
This article has no evaluations. Latest version: Mar 18, 2026

How well can automated speech processing score early elementary student verbal responses on language and literacy assessments?

This article has 3 authors:
1. Ashley Edwards
2. Nuria Gutiérrez
3. Yaacov Petscher
This article has no evaluations. Latest version: Apr 7, 2026
