MAHE: A Multiscale and Hybrid Expert-based Model for Image-Text Enhanced Named Entity Recognition on Social Media



Abstract

In cybersecurity, verifying the authenticity of user identities is critical for combating fake accounts, bots, and malicious users. Existing Multimodal Named Entity Recognition (MNER) methods have made progress on this problem, but most extract visual features with an image encoder and feed them directly into a cross-modal attention mechanism. This approach often struggles to align text with the semantic content of images in complex network environments. To address this issue and improve both the accuracy and efficiency of identity verification, this paper proposes MAHE, an MNER model that combines multi-scale Mamba with a hybrid expert mechanism for modality enhancement. The model uses the hybrid expert mechanism to enhance text recognition and uses the Mamba model's channel attention and local enhancement to generate high-resolution, multi-scale image features. This supports a more comprehensive analysis of user-generated text and images and helps distinguish real users from fake or automated accounts, improving the effectiveness of online identity verification. The model achieves F1 scores of 75.34 and 87.41 on the Twitter-2015 and Twitter-2017 datasets, respectively, making it competitive with state-of-the-art models for online identity verification in cybersecurity tasks.
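The abstract does not state how the reported F1 scores are computed. As a minimal sketch, assuming the entity-level, exact-match, micro-averaged F1 that is common in NER evaluation, the metric could be calculated as follows. The function name `entity_f1`, the `(start, end, type)` span format, and the toy data are illustrative assumptions, not taken from the paper or from the Twitter-2015 and Twitter-2017 datasets.

```python
from typing import List, Set, Tuple

# Assumed span format: (start token index, end token index, entity type).
Entity = Tuple[int, int, str]


def entity_f1(gold: List[Set[Entity]], pred: List[Set[Entity]]) -> float:
    """Micro-averaged, exact-match entity F1 over sentences, as a percentage."""
    # A prediction counts as correct only if span boundaries and type both match.
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical toy data: two sentences, three gold entities.
gold = [{(0, 1, "PER"), (4, 4, "LOC")}, {(2, 3, "ORG")}]
# The second span in the first sentence has the right boundaries but the wrong type.
pred = [{(0, 1, "PER"), (4, 4, "ORG")}, {(2, 3, "ORG")}]
score = entity_f1(gold, pred)
```

Under this definition a span with the correct boundaries but the wrong type counts as both a false positive and a false negative, so two of three predictions are correct in the toy data and precision and recall are both 2/3.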
