Acoustic-Driven Generation of Pathological Speech Reports Using Large Language Models
Abstract
Clinical reports compile patients' histories, treatments, and outcomes, enabling the creation of personalized and effective treatment plans. However, speech disorders are rarely analyzed using such reports, primarily due to the absence of standardized speech protocols. Nevertheless, speech and language therapists (SLTs) can rely on perceptual evaluations, such as the modified Frenchay Dysarthria Assessment (mFDA) scale, to quantify the severity of symptoms across seven categories: breathing, lips, larynx, palate, monotonicity, tongue, and intelligibility. In this paper, we propose using Large Language Models (LLMs) to generate FDA-like text reports from audio recordings. Furthermore, we improve user control over the input to the LLM by extracting acoustic biomarkers (correlated with the categories of the mFDA) and using them as prompts to the language model. To this end, we used speech recordings from 50 Parkinson's disease (PD) patients and 50 healthy controls (HC), whose audio recordings were assessed by three SLTs according to the mFDA. Structured reports are generated by feeding the LLM acoustic biomarkers extracted from the speech signals; only biomarkers correlated with the seven mFDA categories are used. The results demonstrate that the LLMs can generate reports with a BLEU score of 0.789 for PD and 0.836 for HC, showing the potential of our proposed approach for practical medical applications.
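The selection step described above (keeping only acoustic biomarkers correlated with an mFDA category) can be illustrated with a minimal sketch. The biomarker names, values, and correlation threshold below are purely hypothetical; the paper's actual feature set and selection procedure are not specified in this abstract.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_biomarkers(biomarkers, mfda_scores, threshold=0.5):
    """Keep biomarkers whose |r| with a given mFDA category score
    exceeds the threshold. Names and threshold are illustrative only."""
    return {
        name: pearson_r(values, mfda_scores)
        for name, values in biomarkers.items()
        if abs(pearson_r(values, mfda_scores)) >= threshold
    }

# Hypothetical per-patient biomarker values and one mFDA category score:
biomarkers = {
    "jitter": [0.1, 0.2, 0.3, 0.4],   # tracks the category score
    "noise":  [1.0, 1.0, 1.1, 1.0],   # essentially unrelated
}
mfda_larynx = [1, 2, 3, 4]
selected = select_biomarkers(biomarkers, mfda_larynx, threshold=0.8)
```

In a pipeline like the one described, the `selected` biomarkers (and their values for a given patient) would then be serialized into the prompt handed to the LLM.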