Classification of Speech and Associated EEG Responses from Normal-Hearing and Cochlear Implant Talkers Using Support Vector Machines

Abstract

Background/Objectives: Speech produced by individuals with hearing loss differs notably from that of normal-hearing (NH) individuals. Although cochlear implants (CIs) provide sufficient auditory input to support speech acquisition and control, speech intelligibility varies considerably among CI users, and speech produced by CI talkers often exhibits distinct acoustic characteristics compared with that of NH talkers.

Methods: Speech data were obtained from 8 CI and 8 NH talkers, and electroencephalogram (EEG) responses were recorded from 11 NH listeners exposed to the same speech stimuli. Six acoustic features were extracted from the speech signals: Log Energy, Zero-Crossing Rate (ZCR), Pitch, Linear Predictive Coefficients (LPC), Mel-Frequency Cepstral Coefficients (MFCCs), and Perceptual Linear Predictive Cepstral Coefficients (PLP-CC). The same features were extracted from the EEG recordings, leveraging the assumption of quasi-stationarity over short time windows. Support Vector Machine (SVM) classifiers with four kernel functions (Linear, Polynomial, Gaussian, and Radial Basis Function (RBF)) were trained to distinguish NH from CI talkers and evaluated with 3-fold cross-validation, using classification accuracy as the performance metric.

Results: Classification of speech signals yielded the highest accuracies of 100% and 94% for the Energy and MFCC features, respectively, with the Gaussian and RBF kernels. Classification of EEG responses to speech exceeded 70% accuracy for the ZCR and Pitch features with the same kernels. Other features, such as LPC and PLP-CC, yielded moderate to low classification performance.

Conclusions: Both speech-derived and EEG-derived features can effectively differentiate between CI and NH talkers. Among the tested kernels, Gaussian and RBF provided superior performance, particularly with the Energy and MFCC features. These findings support the application of SVMs for multimodal classification in hearing research, with potential applications in improving CI speech processing and auditory rehabilitation.
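As an illustration of the classification setup described in the abstract, the sketch below pairs MFCC extraction with an RBF (Gaussian) kernel SVM evaluated under 3-fold cross-validation. The library choices (librosa, scikit-learn), the file names, and all parameter values are assumptions for illustration, not the study's implementation.

```python
# Hypothetical sketch of the speech-classification pipeline: extract
# MFCC features per utterance, then score an RBF-kernel SVM with
# 3-fold cross-validation. File names and parameters are placeholders.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def mfcc_features(path, n_mfcc=13):
    """Return a fixed-length vector: MFCCs averaged over all frames."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)  # collapse the time axis

# Placeholder recordings: one clip per talker (8 NH and 8 CI, as in the study).
wav_paths = [f"nh_{i:02d}.wav" for i in range(1, 9)] + \
            [f"ci_{i:02d}.wav" for i in range(1, 9)]
labels = np.array([0] * 8 + [1] * 8)  # 0 = NH talker, 1 = CI talker

X = np.vstack([mfcc_features(p) for p in wav_paths])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, labels, cv=3)  # 3-fold cross-validation
print(f"Mean accuracy: {scores.mean():.2%}")
```

Swapping `kernel="rbf"` for `"linear"` or `"poly"` reproduces the kernel comparison at the level of this sketch; a real replication would need the full feature set and the original recordings.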
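The short-time quasi-stationarity assumption mentioned for the EEG analysis is commonly handled by slicing the signal into short overlapping windows and computing features per frame. Below is a minimal sketch for the ZCR and Log Energy features; the window and hop sizes, sampling rate, and random stand-in signal are assumptions, not values from the study.

```python
# Frame-based feature extraction under a short-time quasi-stationarity
# assumption: split a 1-D signal (speech or EEG) into short windows and
# compute zero-crossing rate and log energy per frame.
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames of length frame_len."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def zcr(frames):
    """Fraction of sign changes per frame (zero-crossing rate)."""
    return np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)

def log_energy(frames, eps=1e-10):
    """Log of per-frame energy; eps guards against log(0) on silent frames."""
    return np.log(np.sum(frames**2, axis=1) + eps)

# Example: 25 ms frames with a 10 ms hop at an assumed 16 kHz sampling rate.
sr = 16000
x = np.random.randn(sr)  # stand-in for one second of signal
frames = frame_signal(x, frame_len=int(0.025 * sr), hop=int(0.010 * sr))
features = np.column_stack([zcr(frames), log_energy(frames)])  # (n_frames, 2)
```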
