Deep Learning for Multimodal Facial Expression Recognition with Bengali Audio Integration


Abstract

This study investigates deep learning models for facial expression recognition with integrated Bengali audio feedback. Using a meticulously curated dataset of diverse facial images, each labeled with an emotion and a corresponding Bengali audio clip, along with demographic metadata, we evaluated the performance of CNN, RNN, and hybrid models, and assessed the impact of data augmentation. Our findings demonstrate that hybrid CNN-RNN models achieved superior accuracy in recognizing expressions and generating appropriate Bengali audio feedback. We also analyzed model robustness across demographic groups. This work advances multimodal deep learning, particularly for communication contexts requiring Bengali audio feedback.
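The hybrid CNN-RNN architecture named in the abstract can be illustrated at a high level: a convolutional stage extracts per-frame spatial features from face images, and a recurrent stage folds those features over time before projecting to emotion classes. The sketch below is a minimal NumPy illustration under assumed toy dimensions (frame size, kernel count, hidden size, and class count are hypothetical), not the authors' implementation.

```python
# Minimal sketch of a hybrid CNN-RNN emotion classifier over a sequence
# of grayscale face frames. All shapes and weights are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(img, kernel):
    """Single-channel 'valid' 2D cross-correlation."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def cnn_features(frame, kernels):
    """CNN stage: convolution -> ReLU -> global average pool per kernel."""
    return np.array([np.maximum(conv2d_valid(frame, k), 0).mean()
                     for k in kernels])

def rnn_classify(frames, kernels, Wxh, Whh, Why):
    """RNN stage: fold per-frame CNN features over time, then project
    the final hidden state to one logit per emotion class."""
    h = np.zeros(Whh.shape[0])
    for frame in frames:
        x = cnn_features(frame, kernels)
        h = np.tanh(Wxh @ x + Whh @ h)
    return Why @ h

# Toy setup: 4 frames of 16x16 pixels, 3 conv kernels,
# 8 hidden units, 5 emotion classes (all hypothetical).
frames = rng.standard_normal((4, 16, 16))
kernels = rng.standard_normal((3, 3, 3)) * 0.1
Wxh = rng.standard_normal((8, 3)) * 0.1
Whh = rng.standard_normal((8, 8)) * 0.1
Why = rng.standard_normal((5, 8)) * 0.1

logits = rnn_classify(frames, kernels, Wxh, Whh, Why)
predicted_class = int(np.argmax(logits))
```

In a full system, the predicted class index would then select the matching Bengali audio clip for playback; that lookup step is omitted here.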
