Speech-Based Parkinson's Detection Using Pre-Trained Self-Supervised Automatic Speech Recognition (ASR) Models and Supervised Contrastive Learning

Abstract

Detecting Parkinson's disease (PD) through speech analysis is a promising area of research, as speech impairments are often among the early signs of the disease. This study explores the potential of automatic speech recognition (ASR) models, namely Wav2Vec 2.0 and HuBERT, for detecting PD by fine-tuning these pre-trained models on speech data and employing transfer learning techniques. These models, pre-trained on large unlabeled datasets, can learn rich speech representations that capture acoustic markers of PD. The study also proposes integrating a supervised contrastive learning (SupCon) approach to enhance the models' ability to distinguish PD-specific features. The proposed ASR-based features were compared against two common acoustic feature sets used as baselines: mel-frequency cepstral coefficients (MFCCs) and the extended Geneva minimalistic acoustic parameter set (eGeMAPS). We also employ a gradient-based method, Grad-CAM, to visualize the speech regions that contribute most to the models' predictions. The experiments, conducted on the NeuroVoz dataset, demonstrated that features extracted from the pre-trained ASR models outperformed the baseline features. The results also show that models trained with SupCon consistently outperform those trained with a traditional cross-entropy loss. Wav2Vec 2.0 and HuBERT with SupCon achieved the highest F1-scores, 90.0% and 88.99% respectively. Their AUC scores in the ROC analysis also surpassed those of the cross-entropy models, whose AUCs ranged from 0.84 to 0.89. These results highlight the potential of ASR-based models as scalable, non-invasive tools for diagnosing and monitoring PD, offering a promising avenue for early detection and management of this debilitating condition.
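
To make the supervised contrastive objective mentioned in the abstract concrete, the following is a minimal sketch of the generic SupCon loss as it is commonly formulated in the literature. It is not the authors' implementation: the function names, the temperature of 0.1, and the toy two-dimensional embeddings are assumptions made for illustration. In the study, the embeddings would come from the fine-tuned Wav2Vec 2.0 or HuBERT encoders rather than hand-written vectors.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch (illustrative sketch).

    For each anchor, samples sharing its label are positives; every other
    sample in the batch appears in the softmax denominator. Anchors with no
    positive in the batch are skipped. The temperature is an assumed value.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarity of the anchor to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Numerically stable log-sum-exp for the denominator.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability over the anchor's positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters. Labels are hypothetical (1 = PD, 0 = control).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(toy_embeddings, [1, 1, 0, 0])     # labels match clusters
misaligned = supcon_loss(toy_embeddings, [1, 0, 1, 0])  # labels cut across clusters
```

In this toy batch the loss is small when the labels agree with the cluster structure and large when they cut across it, which is the pressure that pulls same-class utterances together and pushes different-class utterances apart in the embedding space.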
