Robustness and Explainability in Arabic Authorship Models: Tackling the Challenge of AI-Generated Texts
Abstract
The rapid advancement of large language models (LLMs) has intensified the challenge of distinguishing human-authored text from AI-generated content, particularly in morphologically rich and dialectally diverse languages such as Arabic. This study investigates the performance, robustness, and explainability of transformer-based and hybrid architectures for the dual tasks of Arabic authorship attribution and AI-generated text detection. Using the AraGenEval corpus, we evaluated a baseline Logistic Regression model, two transformers (AraBERT v2 and XLM-RoBERTa Large), an ensemble combining both, and a hybrid model integrating fine-tuned AraBERT embeddings with handcrafted stylometric features. Results show that the transformer ensemble achieved the highest performance (weighted F1 = 0.95), capturing subtle linguistic distinctions while reducing single-model variance. The addition of stylometric features yielded modest yet consistent gains in interpretability, highlighting salient lexical and syntactic cues that differentiate human and AI authors. Robustness analyses indicated that the models remained resilient to diacritic removal but degraded notably on dialectal and cross-genre text, underscoring persistent challenges in domain generalization. These findings suggest that integrating transformer ensembles with interpretable, linguistically informed features and domain-diverse corpora can enhance the reliability and transparency of Arabic NLP systems in the era of generative AI.
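The hybrid model described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the stylometric feature set is hypothetical, and random vectors stand in for fine-tuned AraBERT embeddings, which in practice would be extracted from the model's [CLS] representations.

```python
# Hedged sketch: concatenating transformer embeddings with handcrafted
# stylometric features, then training a linear classifier on the joint
# representation. Feature choices and dimensions are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def stylometric_features(text: str) -> np.ndarray:
    """A few simple lexical cues (hypothetical feature set)."""
    tokens = text.split()
    n = max(len(tokens), 1)
    return np.array([
        len(text) / n,                 # mean token length in characters
        len(set(tokens)) / n,          # type-token ratio (lexical diversity)
        sum(t in {"،", ".", "؟"} for t in tokens) / n,  # punctuation rate
    ])

rng = np.random.default_rng(0)
texts = ["نص بشري قصير .", "نص مولد آليا طويل نسبيا ."] * 10
labels = np.array([0, 1] * 10)         # 0 = human-authored, 1 = AI-generated

# Placeholder for 768-dim AraBERT embeddings (one vector per document).
embeddings = rng.normal(size=(len(texts), 768))
stylo = np.vstack([stylometric_features(t) for t in texts])

# Hybrid representation: embeddings and stylometric features side by side.
X = np.hstack([embeddings, stylo])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(X.shape, clf.score(X, labels))
```

A practical advantage of this design is that the stylometric columns remain directly inspectable: their learned coefficients can be read off `clf.coef_` to see which surface cues the classifier relies on, which is the interpretability benefit the abstract attributes to the hybrid model.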