Real-Time Emotion Recognition with CNN and LSTM
Abstract
I present two coupled real-time emotion recognition pipelines: (1) a spatial attention-augmented convolutional neural network (CNN) for facial emotion recognition, and (2) a temporal attention-supported bidirectional long short-term memory (Bi-LSTM) network for speech emotion recognition from Mel-frequency cepstral coefficients (MFCCs). On the benchmark datasets FER-2013 and RAVDESS, I apply state-of-the-art data augmentation techniques (MixUp, CutMix), attention mechanisms, and noise-robust preprocessing. The face pipeline reaches 70%–74% accuracy on FER-2013 and remains robust under varying illumination and partial occlusion. The speech pipeline reaches 82%–85% accuracy on RAVDESS, aided by vocal tract length perturbation and speech-enhancement filtering. I also report precision, recall, and class-wise F1-scores, analyze confusion matrices, and compare against vision-transformer and hybrid CNN-Transformer baselines. The discussion covers class imbalance mitigation, ethical considerations in emotion AI, multimodal fusion techniques, and lifelong learning paradigms. I close with directions toward culturally adaptive, lightweight edge deployment and real-world evaluation protocols.
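To make the two attention mechanisms concrete, here is a minimal PyTorch sketch, not the exact architecture used in this work: it assumes a CBAM-style spatial attention block for the face CNN's feature maps, an additive temporal attention pooling over a Bi-LSTM for the speech branch, 40 MFCC coefficients per frame, and the 8 RAVDESS emotion classes. Layer sizes and module names are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: re-weight each spatial location of a CNN feature map."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                          # x: (B, C, H, W)
        avg_pool = x.mean(dim=1, keepdim=True)     # channel-wise average map
        max_pool = x.max(dim=1, keepdim=True).values  # channel-wise max map
        attn = torch.sigmoid(self.conv(torch.cat([avg_pool, max_pool], dim=1)))
        return x * attn                            # attention-weighted feature map

class AttentiveBiLSTM(nn.Module):
    """Bi-LSTM over MFCC frames with additive temporal attention pooling."""
    def __init__(self, n_mfcc: int = 40, hidden: int = 128, n_classes: int = 8):
        super().__init__()
        self.lstm = nn.LSTM(n_mfcc, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)       # scores each time step
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, mfcc):                       # mfcc: (B, T, n_mfcc)
        h, _ = self.lstm(mfcc)                     # (B, T, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)   # temporal attention weights
        context = (weights * h).sum(dim=1)         # attention-pooled utterance vector
        return self.head(context)                  # emotion logits

if __name__ == "__main__":
    # Smoke test with random tensors shaped like a CNN feature map and an MFCC sequence.
    feat = torch.randn(4, 64, 12, 12)
    print(SpatialAttention()(feat).shape)          # torch.Size([4, 64, 12, 12])
    mfcc = torch.randn(4, 200, 40)                 # 200 frames x 40 MFCCs
    print(AttentiveBiLSTM()(mfcc).shape)           # torch.Size([4, 8])
```

In this sketch the spatial block would sit between convolutional stages of the face CNN, while the temporal block consumes the MFCC sequence directly; both share the same idea of learning where (in space or time) the emotional signal concentrates.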