Emotion-Aware Multimodal Framework with Similarity-Guided Gating for Disaster Misinformation Detection
Abstract
The rapid dissemination of disaster-related misinformation on social media poses serious risks to public safety and data-driven crisis response. Detecting such content remains challenging because misleading posts often combine emotionally charged text with reused, irrelevant, or weakly related images. This multimodal and affective complexity limits the effectiveness of text-only models and reduces the reliability of existing multimodal approaches.

This paper proposes an emotion-aware multimodal framework for disaster misinformation detection that jointly models text, images, emotional signals, and image–text similarity. The approach uses cross-modal attention and similarity-guided gating to regulate visual contributions according to semantic alignment, improving robustness to misleading images.

Experiments on the large-scale Fakeddit dataset show that the proposed method achieves 0.943 accuracy and a 0.942 F1-score, outperforming multimodal baselines. When transferred to the Crisis dataset, it attains 0.895 accuracy and a 0.873 F1-score, indicating effective cross-domain generalization. Additional robustness experiments confirm stability under modality degradation, while analysis of input features highlights the complementary roles of domain-specific emotional cues and domain-invariant image–text inconsistency.

These results suggest the value of combining affective signals with alignment-aware multimodal analysis for reliable disaster misinformation detection.
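To make the similarity-guided gating idea concrete, the sketch below shows one minimal way such a gate could be realized in PyTorch: image features are scaled by a gate derived from the cosine similarity between pooled text and image embeddings, so weakly aligned images contribute less to the fused representation. The class name, layer shapes, and gating function are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimilarityGuidedGate(nn.Module):
    """Illustrative sketch: gate visual features by image-text alignment.

    Hypothetical layer names and dimensions; not the authors' implementation.
    """

    def __init__(self, dim: int):
        super().__init__()
        # Maps a scalar similarity score to a per-dimension gate in (0, 1).
        self.gate_mlp = nn.Sequential(nn.Linear(1, dim), nn.Sigmoid())

    def forward(self, text_feat: torch.Tensor, img_feat: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between pooled text and image embeddings -> (batch, 1)
        sim = F.cosine_similarity(text_feat, img_feat, dim=-1).unsqueeze(-1)
        gate = self.gate_mlp(sim)
        # Down-weight visual evidence when the image is weakly aligned with the text.
        return gate * img_feat


if __name__ == "__main__":
    torch.manual_seed(0)
    gate = SimilarityGuidedGate(dim=256)
    text = torch.randn(4, 256)   # pooled text embeddings (e.g., from a text encoder)
    image = torch.randn(4, 256)  # pooled image embeddings (e.g., from an image encoder)
    gated_image = gate(text, image)
    print(gated_image.shape)     # torch.Size([4, 256])
```

In this reading, the gated image features would then be combined with the text features (for example, via the cross-modal attention mentioned in the abstract) before classification, so that reused or irrelevant images are attenuated rather than dominating the prediction.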