TERN: Type-Aware Evidence Reasoning for Multimodal Fake News Detection
Abstract
The proliferation of multimodal fake news on social media poses a persistent threat to public opinion and information credibility. Existing multimodal detection methods typically adopt a unified fusion strategy for all samples and learn entangled representations where deceptive patterns and semantic content are mixed. As a result, they struggle to (i) account for heterogeneous deception mechanisms such as image tampering and text-image mismatch, and (ii) perform hierarchical reasoning over multiple sources of evidence. To address these limitations, we propose TERN, a type-aware evidence reasoning network for multimodal fake news detection. TERN is built upon three key components. First, a type discovery module leverages contrastive learning and prototype-based clustering in the image feature space to automatically uncover latent deception types without manual annotations. Second, a type-semantic disentanglement module explicitly separates type-discriminative information from content semantics, mitigating spurious correlations between topics and veracity. Third, a type-guided hierarchical evidence fusion module generates textual, visual authenticity, and cross-modal consistency evidence, and adaptively integrates them via type-conditioned attention. Experiments on four public benchmarks (MR2-Chinese, MR2-English, Weibo, and PHEME) demonstrate that TERN outperforms strong multimodal baselines. On average, TERN achieves 93.21% accuracy and 91.09% F1-score, with clear gains in Matthews correlation coefficient, indicating more balanced decisions under class imbalance.
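The type-conditioned attention described above can be illustrated with a minimal sketch. This is not the paper's implementation; the projection matrices, dimensions, and function names below are illustrative assumptions, showing only the general mechanism of weighting three evidence vectors (textual, visual authenticity, cross-modal consistency) by a query derived from a latent deception-type embedding.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def type_conditioned_fusion(evidence, type_emb, W_q, W_k):
    """Fuse evidence vectors with attention conditioned on a deception-type embedding.

    evidence: (3, d) stack of textual, visual-authenticity, and
              cross-modal consistency evidence vectors
    type_emb: (d,) latent deception-type embedding (hypothetical)
    W_q, W_k: (d, d) illustrative projection matrices
    """
    q = W_q @ type_emb                      # type-conditioned query
    keys = evidence @ W_k.T                 # (3, d) projected evidence keys
    scores = keys @ q / np.sqrt(len(q))     # scaled dot-product scores
    weights = softmax(scores)               # attention over the 3 sources
    fused = weights @ evidence              # (d,) weighted evidence summary
    return fused, weights

# Toy usage with random features.
rng = np.random.default_rng(0)
d = 8
evidence = rng.normal(size=(3, d))
type_emb = rng.normal(size=d)
W_q = rng.normal(size=(d, d))
W_k = rng.normal(size=(d, d))
fused, weights = type_conditioned_fusion(evidence, type_emb, W_q, W_k)
```

Because the weights come from a softmax, they form a distribution over the three evidence sources, so a sample whose discovered type suggests image tampering can, in principle, up-weight the visual authenticity evidence while a mismatch-type sample emphasizes cross-modal consistency.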