Automated Detection of Speech Disorders in Parkinson’s Disease using Deep Convolutional Neural Networks: A Pilot Study

Abstract

Background

Patients with Parkinson’s disease (PD) frequently exhibit deficits in functional communication caused by speech disorders associated with dysarthria. These disorders can be characterized by monotony of pitch (or fundamental frequency), reduced loudness, irregular rate of speech, imprecise consonants, and changes in voice quality. This pilot study investigates the application of a speech classifier based on deep convolutional neural networks (CNNs) for aiding early diagnosis of PD.

Methods

In this study, we analyse the performance of two audio feature extraction techniques and their associated model architectures: low-level time-frequency features classified using a Support Vector Machine (SVM), and log mel-spectrograms of segmented audio signals classified using deep convolutional neural networks of varying depth. The models were trained on an open-source data set comprising 73 audio recordings of continuous dialogue from 37 subjects: 16 people with PD (5 females and 11 males) and 21 healthy controls (19 females and 2 males), each of whom was required to perform two speech production tasks.
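
As a rough illustration of the second feature pipeline, the sketch below splits a recording into fixed-length segments and converts one segment into a log mel-spectrogram using NumPy only. It is a minimal sketch, not the study's implementation: the function names, the 16 kHz sample rate, the FFT size, hop length, number of mel bands, and the choice to drop the trailing partial segment are all assumptions made for illustration. Only the 5-second segment length comes from the abstract.

```python
import numpy as np

def segment_audio(signal, sample_rate, segment_seconds=5.0):
    """Split a 1-D signal into non-overlapping segments, dropping any remainder."""
    seg_len = int(round(segment_seconds * sample_rate))
    n_segments = len(signal) // seg_len
    return signal[: n_segments * seg_len].reshape(n_segments, seg_len)

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sample_rate, n_fft, n_mels):
    """Triangular filters equally spaced on the mel scale: (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def log_mel_spectrogram(segment, sample_rate, n_fft=1024, hop=512, n_mels=64):
    """Return a log-scaled (dB) mel power spectrogram: (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(segment) - n_fft) // hop
    frames = np.stack(
        [segment[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_power = power @ mel_filterbank(sample_rate, n_fft, n_mels).T
    # Small constant avoids log(0) on silent frames.
    return 10.0 * np.log10(mel_power + 1e-10).T

# Synthetic stand-in for a recording (assumed 16 kHz, 12 seconds of noise).
sample_rate = 16000
rng = np.random.default_rng(0)
recording = rng.standard_normal(12 * sample_rate)

segments = segment_audio(recording, sample_rate)          # two 5-second segments
spectrogram = log_mel_spectrogram(segments[0], sample_rate)
```

Each resulting two-dimensional array could then be treated as a single-channel image and passed to a CNN classifier.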

Results

The experimental results show that the deep CNN model, trained on the log mel-spectrograms of 5-second segmented audio signals, can differentiate PD subjects from healthy controls (HC) with a mean accuracy of 84.7%, sensitivity of 87.9% and specificity of 89.4%, demonstrating its potential for aiding early diagnosis of PD in a clinical setting. The saliency maps show that the deep CNN model distinguishes between PD participants and healthy controls by detecting centralised, low-frequency regions of the spectrograms representing the speech of PD subjects, whereas a larger range of frequencies is detected in the spectrograms representing speech from healthy controls.
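
For readers unfamiliar with the reported metrics, the following sketch shows how accuracy, sensitivity and specificity are conventionally computed from binary predictions, with PD as the positive class. The function name and the example labels are hypothetical and are not taken from the study's data.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity and specificity for labels 1 (PD) and 0 (HC)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # proportion of PD segments detected
        "specificity": tn / (tn + fp),  # proportion of HC segments correctly rejected
    }

# Hypothetical labels for ten segments: 4 PD followed by 6 HC.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

A mean value such as those reported above would typically be obtained by averaging these per-fold or per-run metrics, although the abstract does not state the evaluation protocol.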
