Detecting Machine-Generated Arabic Text Using AraBERT and LSTM: Toward Trustworthy NLP in Low-Resource Languages

Abstract

Deepfake text generation has emerged as a serious challenge in the age of advanced language models, particularly in low-resource languages like Arabic. This study presents a deep learning-based approach to detect synthetic Arabic text generated by AI systems. We propose a binary classification framework combining AraBERT embeddings with a Long Short-Term Memory (LSTM) network. A balanced dataset of 87,452 samples was constructed using real Arabic text and synthetic text generated via AraGPT2. Our best-performing model achieved a test accuracy of 99.5%, demonstrating strong generalization and detection capability. This work contributes to enhancing Arabic NLP security and offers a foundation for future multilingual deepfake detection systems.
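The abstract describes the architecture only at a high level (AraBERT embeddings feeding an LSTM for binary classification). The sketch below is an illustrative reconstruction of that pipeline, not the authors' code: the model checkpoint name, hidden sizes, frozen-encoder choice, and pooling strategy are assumptions.

```python
# Minimal sketch (assumed details, not the paper's implementation):
# AraBERT contextual embeddings feeding an LSTM binary classifier
# (human-written vs. machine-generated Arabic text).
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class AraBertLstmClassifier(nn.Module):
    def __init__(self, bert_name="aubmindlab/bert-base-arabertv2",  # assumed AraBERT variant
                 lstm_hidden=128, freeze_bert=True):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        if freeze_bert:  # assumption: AraBERT used as a fixed feature extractor
            for p in self.bert.parameters():
                p.requires_grad = False
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_hidden, 1)  # single logit: real vs. synthetic

    def forward(self, input_ids, attention_mask):
        # Contextual token embeddings from AraBERT
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        # Sequence modeling with a bidirectional LSTM; pool the final hidden states
        _, (h_n, _) = self.lstm(hidden)
        pooled = torch.cat([h_n[0], h_n[1]], dim=-1)
        return self.classifier(pooled).squeeze(-1)  # logit for BCEWithLogitsLoss

# Usage example on a single Arabic sentence
tokenizer = AutoTokenizer.from_pretrained("aubmindlab/bert-base-arabertv2")
model = AraBertLstmClassifier()
batch = tokenizer(["نص عربي للتجربة"], padding=True, truncation=True,
                  max_length=128, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
probs = torch.sigmoid(logits)  # probability the text is machine-generated
```

In such a setup the model would be trained with a binary cross-entropy loss on the balanced real/AraGPT2-generated corpus described above; whether the paper fine-tunes AraBERT end-to-end or keeps it frozen is not stated in the abstract.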