An empirical comparison of ensemble and deep learning models for multi-level Arabic fake news detection using the JoNewsFake dataset
Abstract
We present a comparative study of ensemble and deep learning architectures for multi-label, multi-level Arabic fake news detection on JoNewsFake, a dataset of 50,000 Facebook posts from 12 verified Jordanian news agencies annotated with 22 main categories, ~75 subcategories, and Fake/Real labels. The models include Random Forest, Extra Trees, LightGBM, Convolutional Neural Network (CNN) + Bi-LSTM, CNN + Bi-GRU, and a Transformer encoder. All systems use a hybrid representation that concatenates AraBERT semantic embeddings, POS-based syntactic features, and emotion indicators (894 dimensions in total). Extra Trees yields the strongest overall performance, with Macro F1-scores of ≈0.95 (Main), ≈0.98 (Sub), and ≈0.95 (Fake/Real), while the Transformer achieves a Subcategory Macro F1-score of ≈0.93, suggesting that self-attention offers benefits for fine-grained label spaces. We provide a clear account of data collection, filtering, annotation, and evaluation to support reproducibility. The findings show that carefully engineered features paired with efficient ensembles remain highly competitive for Arabic news, whereas Transformer encoders excel as hierarchical granularity increases. This work offers a rigorous, data-driven baseline for Arabic fake news detection across multiple classification levels and can guide future extensions (e.g., cross-dialect coverage and cross-platform validation).
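The hybrid representation described above can be sketched as a simple feature concatenation followed by an Extra Trees fit. This is a minimal illustration, not the authors' code: the abstract only states a 894-dimensional total, so the per-component split used below (768 AraBERT dimensions plus assumed POS and emotion widths), the toy sample size, and the random data are all illustrative assumptions.

```python
# Hedged sketch of the hybrid feature pipeline from the abstract:
# concatenate AraBERT embeddings, POS-based syntactic features, and
# emotion indicators into one 894-dim vector, then fit Extra Trees
# on the binary Fake/Real level. All data here is synthetic.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

rng = np.random.default_rng(0)
n_posts = 200  # toy sample size, not the 50,000-post corpus

# Assumed component widths: 768 (AraBERT sentence embedding) +
# 116 POS-based syntactic features + 10 emotion indicators = 894.
# The 768/116/10 split is an assumption; only 894 is given.
arabert_emb = rng.normal(size=(n_posts, 768))
pos_feats = rng.random(size=(n_posts, 116))
emotion_feats = rng.random(size=(n_posts, 10))

# Hybrid representation: simple horizontal concatenation.
X = np.concatenate([arabert_emb, pos_feats, emotion_feats], axis=1)

y = rng.integers(0, 2, size=n_posts)  # toy Fake/Real labels

clf = ExtraTreesClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
preds = clf.predict(X)
```

The same concatenated matrix `X` would be reused unchanged for the Main-category and Subcategory levels, swapping only the label vector `y` for the corresponding multi-class targets.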