A Lightweight, Explainable Spam Detection System with Rüppell’s Fox Optimizer for the Social Media Network X
Abstract
Effective spam detection is essential for online social media networks (OSNs) and for cybersecurity more broadly, because it directly influences the quality of security-related decision-making. In today’s digital communications, unsolicited spam degrades the user experience and threatens platform security. Machine learning-based spam detection systems offer an automated defense, but their effectiveness hinges on selecting an optimal feature subset. They are also frequently hindered by the “black box” problem: a lack of interpretability that constrains their deployment in security applications, where understanding the rationale behind a classification is crucial for efficient threat evaluation and response. To address these issues, we propose a lightweight, explainable spam detection model that integrates a nature-inspired optimizer. The approach cleans and preprocesses the data, then performs feature selection with Rüppell’s Fox Optimization (RFO), a swarm-based, nature-inspired metaheuristic algorithm. To the best of our knowledge, this is the first time the algorithm has been adapted to the field of cybersecurity. The resulting minimal feature set is used to train a supervised classifier that achieves high detection rates and accuracy for spam accounts. To interpret model predictions, Shapley values are computed and illustrated through swarm and summary charts. The proposed system was empirically assessed on two datasets. Using RFO with decision tree (DT), k-nearest neighbors (KNN), AdaBoost, and logistic regression (LR) classifiers, it achieved accuracies of 99.10%, 98.77%, 96.57%, and 92.24%, respectively, on Dataset 1 and 98.94%, 98.67%, 95.04%, and 94.52%, respectively, on Dataset 2. The results validate the efficacy of the proposed approach, providing an accurate and interpretable model for spam account identification. This study thus offers a thorough and dependable solution to the problem of spam account detection.
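As an illustration of the wrapper-style feature selection the abstract describes, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for the example: the weighted fitness function and its `alpha` parameter (a formulation commonly used with metaheuristic feature selection), the synthetic data from the hypothetical `make_data` helper, the pure-Python 1-nearest-neighbor classifier, and above all the optimizer. A simple bit-flip hill climber stands in for RFO, whose update rules are not given in the abstract.

```python
import random


def knn1_accuracy(train, test, mask):
    """Accuracy of a 1-nearest-neighbor classifier using only masked-in features."""
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0  # no features selected: treat as a useless classifier
    correct = 0
    for x, y in test:
        nearest = min(train, key=lambda t: sum((t[0][i] - x[i]) ** 2 for i in idx))
        correct += nearest[1] == y
    return correct / len(test)


def fitness(mask, train, test, alpha=0.99):
    """Lower is better: weighted sum of error rate and fraction of features kept.

    The alpha weighting is an assumption for this sketch, not taken from the paper.
    """
    error = 1.0 - knn1_accuracy(train, test, mask)
    return alpha * error + (1 - alpha) * sum(mask) / len(mask)


def search_mask(train, test, n_features, iters=200, seed=0):
    """Stand-in optimizer: single-bit-flip hill climbing (NOT the RFO algorithm)."""
    rng = random.Random(seed)
    best = [1] * n_features
    best_fit = fitness(best, train, test)
    for _ in range(iters):
        cand = best[:]
        j = rng.randrange(n_features)
        cand[j] = 1 - cand[j]  # toggle one feature in or out
        f = fitness(cand, train, test)
        if f < best_fit:  # accept only strict improvements
            best, best_fit = cand, f
    return best, best_fit


def make_data(n, seed):
    """Synthetic rows: feature 0 carries the label signal, features 1-4 are noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        informative = 2.0 * y + rng.gauss(0, 0.3)
        noise = [rng.gauss(0, 1.0) for _ in range(4)]
        rows.append(([informative] + noise, y))
    return rows


train = make_data(60, seed=1)
test = make_data(40, seed=2)
mask, score = search_mask(train, test, n_features=5)
print("selected mask:", mask, "fitness:", round(score, 4))
```

A population-based optimizer such as RFO would replace `search_mask` while keeping the same fitness interface: each candidate is a binary mask over the features, scored by how well a classifier trained on the selected subset performs.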