Deep Learning for Multimodal Sentimental Analysis Using Long-Short Term Memory


Abstract

The increasing popularity of mobile devices and social networks has led people to share images and text to express their emotions and opinions. This paper proposes a deep learning-based technique for multimodal sentiment analysis. For this work, the CMU-MOSI and CMU-MOSEI datasets are utilized, which contain video, audio and text clips. Because the inputs arrive as video, audio and text, each modality is preprocessed and its features are extracted with a separate technique. The features extracted by these separate techniques are then concatenated and given as input to the classifier for sentiment analysis. Long Short-Term Memory (LSTM) is utilized in this research for classification. Performance metrics such as 7-class accuracy (Acc-7), binary accuracy (Acc-2), F1-score, Mean Absolute Error (MAE) and Correlation Coefficient (CORR) are used to evaluate the LSTM model. The attained results show that the LSTM model obtains a better Acc-2 of 91.57% on the CMU-MOSI dataset and 91.28% on the CMU-MOSEI dataset when compared to existing techniques such as the Multi-Tensor Fusion Network with Cross-Modal Modeling (MTFN-CMM) and the Sparse and Cross-Attention Network (SCANET).
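
To make the described pipeline concrete, the following is a minimal PyTorch sketch of the fusion-and-classification stage: per-modality feature sequences are concatenated along the feature dimension and passed to an LSTM whose final hidden state feeds a 7-class sentiment head. The layer sizes, feature dimensions (300 for text, 74 for audio, 35 for video, typical of CMU-MOSI feature sets), and class names are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

class MultimodalLSTM(nn.Module):
    # Sketch only: concatenate per-modality features, classify with an LSTM.
    # All dimensions below are assumed for illustration, not taken from the paper.
    def __init__(self, text_dim=300, audio_dim=74, video_dim=35,
                 hidden_dim=128, num_classes=7):
        super().__init__()
        fused_dim = text_dim + audio_dim + video_dim  # simple feature concatenation
        self.lstm = nn.LSTM(fused_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, text_feats, audio_feats, video_feats):
        # Each input: (batch, seq_len, modality_dim); fuse along the feature axis.
        fused = torch.cat([text_feats, audio_feats, video_feats], dim=-1)
        _, (h_n, _) = self.lstm(fused)    # h_n: (num_layers, batch, hidden_dim)
        return self.classifier(h_n[-1])   # logits: (batch, num_classes)

# Hypothetical usage with random tensors standing in for extracted features.
model = MultimodalLSTM()
t = torch.randn(8, 20, 300)  # text features (e.g., 300-d word embeddings)
a = torch.randn(8, 20, 74)   # audio features (e.g., COVAREP-style descriptors)
v = torch.randn(8, 20, 35)   # visual features (e.g., facial-expression descriptors)
logits = model(t, a, v)      # (8, 7): one score per 7-point sentiment class

Taking the argmax of the logits yields the Acc-7 prediction; collapsing the 7-point scale into positive/negative gives the binary (Acc-2) evaluation reported above.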
