A Mathematical Comparison of Machine Learning and Deep Learning Models for Automated Fake News Detection
Abstract
Detecting fake news is a critical challenge in natural language processing (NLP), demanding solutions that balance accuracy, interpretability, and computational efficiency. In this study, we systematically evaluate the mathematical foundations and empirical performance of five representative models for automated fake news classification: three classical machine learning algorithms, namely Logistic Regression, Random Forest, and the Light Gradient Boosting Machine (LightGBM), and two state-of-the-art deep learning architectures, namely A Lite Bidirectional Encoder Representations from Transformers (ALBERT) and Gated Recurrent Units (GRUs). Leveraging the large-scale WELFake dataset, we conduct rigorous experiments under both headline-only and headline-plus-content input scenarios, providing a comprehensive assessment of each model's capability to capture linguistic, contextual, and semantic cues. We analyze each model's optimization framework, decision boundaries, and feature-importance mechanisms, highlighting the mathematical trade-offs between representational capacity, generalization, and interpretability. Our results reveal that transformer-based models, particularly ALBERT, achieve state-of-the-art performance, especially when rich textual context is available, while classical ensemble models remain competitive for resource-constrained and interpretability-focused applications. This work advances the mathematical discourse on NLP by bridging theoretical model properties and practical deployment strategies for misinformation detection in high-dimensional, real-world text data.
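To make the classical-model side of the comparison concrete, the sketch below shows the simplest of the evaluated families, a logistic regression fake/real classifier trained by gradient descent on bag-of-words features. It is a minimal illustration only: the toy headlines, vocabulary, and hyperparameters are hypothetical stand-ins, not the WELFake data or the configuration used in the study.

```python
import math
from collections import Counter

def bag_of_words(text, vocab):
    # Map a text to a fixed-length word-count vector over a fixed vocabulary.
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=200):
    # Per-example gradient descent on the logistic (cross-entropy) loss.
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the loss w.r.t. the pre-activation
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5)

# Toy corpus (hypothetical examples, not WELFake): label 1 = fake, 0 = real.
docs = [
    ("shocking miracle cure doctors hate", 1),
    ("you will not believe this shocking secret", 1),
    ("government announces new budget policy", 0),
    ("scientists publish peer reviewed study", 0),
]
vocab = sorted({w for text, _ in docs for w in text.lower().split()})
X = [bag_of_words(t, vocab) for t, _ in docs]
y = [lbl for _, lbl in docs]
w, b = train_logreg(X, y)
print(predict(w, b, bag_of_words("shocking secret cure", vocab)))           # expect 1
print(predict(w, b, bag_of_words("new peer reviewed policy study", vocab))) # expect 0
```

The learned weight vector is directly inspectable (one weight per vocabulary word), which illustrates the interpretability advantage the abstract attributes to the classical models; the transformer and GRU baselines trade this transparency for far greater representational capacity.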