Automated Detection of Speech Disorders in Parkinson’s Disease using Deep Convolutional Neural Networks: A Pilot Study

Sara A. Jones
Jeremy Cosgrove
He Wang
Ryan K. Mathew

Abstract

Background

Patients with Parkinson’s disease (PD) frequently exhibit deficits in functional communication because of speech disorders associated with dysarthria, which can be characterized by monotony of pitch (or fundamental frequency), reduced loudness, irregular rate of speech, imprecise consonants, and changes in voice quality. This pilot study investigates the application of a speech classifier based on deep convolutional neural networks (CNNs) for aiding early diagnosis of PD.

Methods

In this study, we analyse the performance of two audio feature extraction techniques and their associated model architectures: low-level time-frequency features classified using a Support Vector Machine (SVM), and log mel-spectrograms of segmented audio signals classified using deep convolutional neural networks of varying depth. The models were trained using an open-source data set comprising 73 audio recordings of continuous dialogue from 37 subjects, including 16 people with PD (5 females and 11 males) and 21 healthy controls (19 females and 2 males), who were required to perform two speech production tasks.
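The second feature pipeline described above (fixed-length segmentation followed by a log mel-spectrogram) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the abstract does not state the sample rate, FFT size, hop length, number of mel bands, or whether segments overlap, so `sample_rate=16000`, `n_fft=1024`, `hop=512`, `n_mels=64`, and non-overlapping segments are all assumptions, and every function name here is hypothetical.

```python
import numpy as np

def segment_audio(signal, sample_rate, segment_seconds=5.0):
    """Split a 1-D signal into non-overlapping segments; the remainder is dropped."""
    seg_len = int(segment_seconds * sample_rate)
    n_segments = len(signal) // seg_len
    return signal[: n_segments * seg_len].reshape(n_segments, seg_len)

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(sample_rate, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        left, centre, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (centre - left)
        falling = (right - fft_freqs) / (right - centre)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def log_mel_spectrogram(segment, sample_rate, n_fft=1024, hop=512, n_mels=64):
    """Return a log-scaled mel spectrogram in dB, shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(segment) - n_fft) // hop
    frames = np.stack(
        [segment[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_power = power @ mel_filterbank(sample_rate, n_fft, n_mels).T
    # A small constant avoids log(0) in silent frames or empty mel bands.
    return 10.0 * np.log10(mel_power + 1e-10).T

# Synthetic stand-in for a recording: 12 s of a 150 Hz tone (assumed 16 kHz audio).
sample_rate = 16000
t = np.arange(12 * sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 150.0 * t)

segments = segment_audio(signal, sample_rate)            # two full 5 s segments
spectrogram = log_mel_spectrogram(segments[0], sample_rate)
print(segments.shape, spectrogram.shape)                 # (2, 80000) (64, 155)
```

Each resulting spectrogram would then be treated as a single-channel image by the CNN. In practice an audio library would normally replace the hand-written filterbank; it is spelled out here only to keep the sketch dependency-free.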

Results

The experimental results show that the deep CNN model, trained on the log mel-spectrograms of 5-second segmented audio signals, can successfully differentiate PD subjects from healthy controls (HC) with a mean accuracy of 84.7%, sensitivity of 87.9% and specificity of 89.4%, thus demonstrating its potential for aiding early diagnosis of PD in a clinical setting. The saliency maps show that the deep CNN model can distinguish between PD participants and healthy controls by detecting centralised, low-frequency regions of the spectrograms representing the speech of PD subjects, whereas a larger range of frequencies is detected in the spectrograms representing speech from healthy controls.
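For readers less familiar with the three reported metrics, the sketch below shows how accuracy, sensitivity, and specificity are conventionally derived from binary predictions, with PD as the positive class. The labels are made-up illustration values, not data from the study, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity (recall on positives), and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # share of PD segments correctly flagged
        "specificity": tn / (tn + fp),   # share of HC segments correctly cleared
    }

# Hypothetical labels: 1 = PD, 0 = healthy control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.8, sensitivity 0.75, specificity 5/6 (about 0.833)
```

Because the reported figures are means, they would typically be averaged over repeated runs or cross-validation folds rather than computed from a single confusion matrix; the abstract does not specify which procedure was used.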

Version published to 10.1101/2025.07.18.25331737 on medRxiv
Jul 18, 2025

Acoustic-Driven Generation of Pathological Speech Reports Using Large Language Models

This article has 9 authors:
1. Tomas Arias-Vergara
2. Lukas Buess
3. Nastassia Vysotskaya
4. Soroosh Tayebi Arasteh
5. Juan Rafael Orozco-Arroyave
6. Maria Schuster
7. Elmar Noeth
8. Andreas Maier
9. Paula Andrea Perez-Toro
This article has no evaluations
Latest version Aug 19, 2025

Neural processing of natural speech by children with developmental language disorder (DLD): EEG speech decoding, power and classifier investigations

This article has 5 authors:
1. Mahmoud Keshavarzi
2. Susan Richards
3. Georgia Feltham
4. Lyla Parvez
5. Usha Goswami
This article has no evaluations
Latest version Jul 18, 2025

Automated EEG-Based Classification of Nonclinical Depressive States via the Integration of Automatic Speech Recognition and a Pretrained Language Model

This article has 7 authors:
1. Hiroki Watanabe
2. Aya S. Ihara
3. Masato Okada
4. Sakriani Sakti
5. Mitsuyoshi Tachimori
6. Etsuo Mizukami
7. Yasushi Naruse
This article has no evaluations
Latest version Jul 14, 2025
