Developing a Tool for Storytelling Quality Assessment Using Acoustic Features

Abstract

Storytelling is a key part of early childhood development, especially when it is interactive and expressive. Research shows that dynamic reading styles help activate brain areas linked to attention, imagination, and language, but most current tools still rely on subjective observation rather than on objective, measurable features. Here, we introduce a tool that analyzes the quality of storytelling and provides a score based on three main aspects: how expressive the storyteller is (i.e., the level of voice engagement, measured by the monotony of the speech), how clear the speech is (i.e., pronunciation accuracy), and how natural the storytelling feels (in terms of speech rhythm and flow). The dataset comprised 119 recordings, compiled by manually segmenting longer audio files into focused speech segments. The project authors scored each segment on three dimensions (Expressiveness, Clarity, and Naturalness) using a 1 to 5 scale. Acoustic features such as pitch variability, amplitude dynamics, formant dispersion, and various spectral descriptors were extracted using Python libraries including Librosa and pyAudioAnalysis. Two complementary approaches were applied. First, a GUI-based app was developed to extract features and visualize them against labeled benchmarks. Second, a machine learning analysis was performed using Random Forest regression with Leave-One-Out Cross-Validation to explore predictive patterns and identify key acoustic indicators. Feature selection significantly improved predictive performance, with pitch-related features consistently emerging as the most informative. Results showed that expressive storytelling was characterized by higher pitch and amplitude variability and by clearer articulation, while naturalness features showed weaker correlations with the ratings. These findings support the feasibility of using automated acoustic analysis to evaluate storytelling quality.
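
To make the notions of pitch variability and amplitude dynamics concrete, the following is a minimal sketch of how such summaries could be computed from frame-level values. It is not the authors' implementation: the function names, the convention that unvoiced frames carry a pitch of 0, the use of semitones relative to the median pitch, and the example values are all assumptions made for illustration. In practice the frame-level pitch and energy would come from a library such as Librosa.

```python
import math
import statistics


def pitch_variability_semitones(f0_hz):
    """Standard deviation of the voiced pitch contour, in semitones.

    Assumption: frames with f0 <= 0 are unvoiced and are skipped.
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        return 0.0
    # Express each frame relative to the speaker's median pitch so the
    # measure is comparable across low and high voices.
    ref = statistics.median(voiced)
    semitones = [12.0 * math.log2(f / ref) for f in voiced]
    return statistics.pstdev(semitones)


def amplitude_range_db(rms):
    """Ratio of the loudest to the quietest non-silent frame, in dB."""
    audible = [r for r in rms if r > 0]
    if len(audible) < 2:
        return 0.0
    return 20.0 * math.log10(max(audible) / min(audible))


# Hypothetical frame-level values, not data from the study.
monotone_f0 = [200.0, 0.0, 201.0, 199.0, 200.0]
expressive_f0 = [150.0, 0.0, 300.0, 200.0, 260.0]

print(pitch_variability_semitones(monotone_f0))
print(pitch_variability_semitones(expressive_f0))
print(amplitude_range_db([0.01, 0.10, 0.05]))
```

Under these assumptions, a flat (monotonous) contour yields a value near zero, while a wide-ranging contour yields a larger value. Summaries of this kind could then serve as input features for a regression model.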
