Deep Temporal Features and Multi-Level Cross-Modal Attention Fusion for Multimodal Sentiment Analysis
Abstract
To address the challenges of insufficient multimodal feature extraction and limited cross-modal semantic diversity and interaction in multimodal sentiment analysis, this paper introduces Deep Temporal Features and Multi-Level Cross-Modal Attention Fusion (DTMCAF). First, a deep temporal feature extractor is developed: a multimodal temporal modeling network that combines bidirectional LSTMs with multi-head self-attention to capture multimodal temporal features. Next, hierarchical cross-modal attention mechanisms and feature-enhancement attention modules are designed to enable thorough information exchange between modalities. Gated fusion and multi-layer feature transformations are then employed to strengthen the fused multimodal representation. Finally, a multi-component collaborative loss function is proposed to align cross-modal features and optimize sentiment representations. Comprehensive experiments on the CMU-MOSI and CMU-MOSEI datasets show that the proposed method outperforms current state-of-the-art techniques in correlation, accuracy, and F1 score, significantly improving the precision of multimodal sentiment analysis.
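To make the architectural components named above concrete, the sketch below shows, in PyTorch, a minimal version of a BiLSTM + multi-head self-attention temporal encoder, a cross-modal attention block, and a gated fusion layer. This is not the authors' implementation: the abstract does not specify layer sizes, the number of attention levels, or the loss terms, so all class names, dimensions, and the two-modality setup here are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code) of the abstract's named components:
# a BiLSTM + multi-head self-attention temporal encoder per modality, a cross-modal
# attention block, and a gated fusion layer. All names and sizes are illustrative.
import torch
import torch.nn as nn


class TemporalEncoder(nn.Module):
    """Deep temporal feature extractor: BiLSTM followed by multi-head self-attention."""

    def __init__(self, in_dim: int, hidden: int = 128, heads: int = 4):
        super().__init__()
        self.bilstm = nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True)
        self.self_attn = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)

    def forward(self, x):                      # x: (batch, seq_len, in_dim)
        h, _ = self.bilstm(x)                  # (batch, seq_len, 2*hidden)
        out, _ = self.self_attn(h, h, h)       # self-attention over time steps
        return out


class CrossModalAttention(nn.Module):
    """One level of cross-modal attention: the target modality attends to the source."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, target, source):
        enhanced, _ = self.attn(target, source, source)
        return enhanced + target               # residual feature enhancement


class GatedFusion(nn.Module):
    """Gated fusion of two modality representations."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, a, b):
        g = torch.sigmoid(self.gate(torch.cat([a, b], dim=-1)))
        return g * a + (1 - g) * b


if __name__ == "__main__":
    text = torch.randn(8, 50, 300)              # e.g. text features (hypothetical dims)
    audio = torch.randn(8, 50, 74)              # e.g. audio features (hypothetical dims)
    enc_t, enc_a = TemporalEncoder(300), TemporalEncoder(74)
    t, a = enc_t(text), enc_a(audio)            # both (8, 50, 256)
    xattn = CrossModalAttention(256)
    fused = GatedFusion(256)(xattn(t, a), xattn(a, t))
    print(fused.shape)                          # torch.Size([8, 50, 256])
```

In this reading, each modality is first encoded in time, cross-modal attention exchanges information in both directions, and the gate decides how much of each enhanced stream enters the fused representation; the paper's hierarchical attention levels and multi-component loss would sit on top of such building blocks.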