Investigating Hybrid Deep Learning Architectures for Speech Envelope Reconstruction from EEG

Uday Sankar Gottipalli
Aditi Jha
Krishna Miyapuram

Abstract

Reconstructing speech envelopes from electroencephalography (EEG) signals is a challenging but valuable task for brain-computer interfaces (BCIs), with applications in assistive communication for individuals with speech impairments. While deep learning has improved reconstruction accuracy, most existing approaches are restricted to single-layer architectures such as convolutional neural networks (CNNs). This limits their ability to capture the full complexity of spatio-temporal and structural EEG patterns. In this work, we systematically extend the VLAAI framework by evaluating 26 architectures that integrate CNNs, long short-term memory networks (LSTMs), and graph convolutional networks (GCNs) in both single-layer and hybrid configurations. Experiments on the 64-channel SparrKULee dataset demonstrate that CNNs remain the strongest standalone models, but that hybrid designs, particularly CNN-LSTM and CNN-GCN-LSTM, achieve competitive or superior performance. These results highlight the importance of combining spatial, temporal, and graph-based processing, and provide practical guidelines for hybrid architecture design. Our study offers the first large-scale comparative analysis of hybrid models for EEG-based speech envelope reconstruction, advancing robust BCI systems for non-invasive speech decoding.
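
To make the idea of a hybrid design concrete, the following is a minimal pure-Python sketch of how a graph-based stage, a convolutional stage, and a recurrent stage could be chained on multichannel EEG. It is not the VLAAI model or any of the 26 architectures evaluated in the preprint. All function names, the toy adjacency matrix, and the weights are hypothetical, and a plain tanh recurrent unit stands in for an LSTM to keep the example short.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows = EEG channels, columns = time samples


def normalize_adjacency(adj: Matrix) -> Matrix:
    """Add self-loops and row-normalize, so each channel averages itself and its neighbours."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]


def graph_mix(eeg: Matrix, adj_norm: Matrix) -> Matrix:
    """Graph stage (hypothetical, weight-free): mix channels along the electrode graph."""
    n_ch, n_t = len(eeg), len(eeg[0])
    return [
        [sum(adj_norm[i][j] * eeg[j][t] for j in range(n_ch)) for t in range(n_t)]
        for i in range(n_ch)
    ]


def temporal_conv(eeg: Matrix, kernel: List[float]) -> List[float]:
    """Convolutional stage: slide one shared kernel over time (no padding), summing over channels."""
    n_out = len(eeg[0]) - len(kernel) + 1
    return [
        sum(kernel[k] * channel[t + k] for channel in eeg for k in range(len(kernel)))
        for t in range(n_out)
    ]


def recurrent_pass(seq: List[float], w_in: float, w_rec: float) -> List[float]:
    """Recurrent stage: a simple tanh unit used here as a stand-in for an LSTM."""
    h, out = 0.0, []
    for x in seq:
        h = math.tanh(w_in * x + w_rec * h)  # new state depends on input and previous state
        out.append(h)
    return out


def hybrid_forward(eeg: Matrix, adj: Matrix, kernel: List[float],
                   w_in: float, w_rec: float) -> List[float]:
    """Chain graph -> convolution -> recurrence to produce one value per time step."""
    mixed = graph_mix(eeg, normalize_adjacency(adj))
    features = temporal_conv(mixed, kernel)
    return recurrent_pass(features, w_in, w_rec)


# Toy input: 2 channels, 4 samples, with the two channels connected to each other.
toy_eeg = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0]]
toy_adj = [[0.0, 1.0], [1.0, 0.0]]
estimate = hybrid_forward(toy_eeg, toy_adj, kernel=[0.5, 0.5], w_in=0.1, w_rec=0.5)
```

In a trained model the kernel, recurrent weights, and graph weights would be learned from data; here they are fixed by hand purely to show how the three kinds of processing compose.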

Version published to 10.64898/2026.05.24.727471 on bioRxiv
May 27, 2026

Related articles

Interpretable Modeling Reveals Population-Level Representation Differences in P300 Brain Computer Interfaces Across Neurodivergent and Neurotypical Cohorts

This article has 7 authors:
1. Xiaowei Jiang
2. Sudong Shang
3. Adrian Wilkinson
4. Michael Platt
5. Da Xiao
6. Beining Cao
7. Thomas Do
This article has no evaluations. Latest version: Apr 27, 2026

Machine learning-based decoding of emotional valence from electrophysiological signals in the monkey brain

This article has 5 authors:
1. Nakamura Shinya
2. Tuo Xiaoying
3. Watanabe Hidenori
4. Takuya Sasaki
5. Ken-Ichiro Tsutsui
This article has no evaluations. Latest version: May 18, 2026

Comparison of Machine Learning Surrogate Models for Prediction of Single-Fiber Activation in Deep Brain Stimulation

This article has 6 authors:
1. Jorge Alberto
2. Benjamin Norbom
3. Justin Golabek
4. Joshua K. Wong
5. Matthew Schiefer
6. Erin E. Patrick
This article has no evaluations. Latest version: May 15, 2026
