A Comparative Analysis of Deep Learning and Machine Learning Approaches for Spam Identification on Telegram

Abstract

Spam on messaging apps like Telegram is a serious threat to user security and experience. In this paper, we compared several machine learning (ML) and deep learning (DL) models to find the most effective way to detect it. We tested our models on a dataset of 20,348 messages. We put classic approaches, such as Logistic Regression and tree-based models (including bagging and boosting), against modern neural networks: a GRU and the ALBERT transformer. The results show that GRU and ALBERT clearly outperformed the classic approaches. The ALBERT model was the top performer, achieving state-of-the-art results with a weighted F1-score of 0.97 and an AUC of 0.9943. The GRU model also delivered excellent performance, with an F1-score of 0.94. Their real strength was in identifying the tricky minority 'spam' class. Here, ALBERT reached an F1-score of 0.95 and the GRU model scored 0.90, significantly outperforming the other methods. We used McNemar's test to confirm that these findings were statistically significant. Ultimately, our study sets a new benchmark for spam detection. It shows that transformer models can effectively secure messaging platforms using only the content of the message itself.
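
As a rough illustration of the significance check mentioned in the abstract, McNemar's test compares two classifiers on the same test set by counting only the messages on which they disagree. The sketch below is a minimal version under stated assumptions: the function name `mcnemar`, the toy labels, and the choice to report both the exact binomial p-value and the continuity-corrected chi-square p-value are illustrative, since the abstract does not say which variant the authors used. None of the numbers come from the paper.

```python
from math import comb, erfc, sqrt


def mcnemar(y_true, pred_a, pred_b):
    """Return (b, c, p_exact, p_chi2) for two classifiers scored on the same items.

    b = items model A classifies correctly and model B misclassifies
    c = items model A misclassifies and model B classifies correctly
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        # The models never disagree, so there is no evidence of a difference.
        return b, c, 1.0, 1.0

    # Exact two-sided test: under H0 the discordant pairs split as Binomial(n, 0.5).
    k = min(b, c)
    p_exact = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n)

    # Chi-square approximation with continuity correction (1 degree of freedom).
    stat = (abs(b - c) - 1) ** 2 / n
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_chi2 = erfc(sqrt(stat / 2))
    return b, c, p_exact, p_chi2


if __name__ == "__main__":
    # Toy labels (1 = spam, 0 = ham); hypothetical, not taken from the paper.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
    model_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
    model_b = [0, 0, 0, 1, 1, 0, 0, 0, 1, 0]
    print(mcnemar(y_true, model_a, model_b))
```

A small p-value suggests that the two models' error patterns differ by more than chance would explain. The exact form is usually preferred when the number of discordant pairs is small.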