Artificial Intelligence vs. Human: Decoding Text Authenticity with Transformers

Abstract

This paper presents a comprehensive study on detecting AI-generated text using transformer models. Our research extends the existing RODICA dataset to create the Enhanced RODICA for Human-Authored and AI-Generated Text (ERH) dataset. We enriched RODICA by incorporating machine-generated texts from various large language models (LLMs), ensuring a diverse and representative corpus. Methodologically, we fine-tuned several transformer architectures, including BERT, RoBERTa, and DistilBERT, on this dataset to distinguish between human-written and AI-generated text. Our experiments examined both monolingual and multilingual settings, evaluating model performance across diverse datasets such as M4, AICrowd, Indonesian Hoax News Detection, TURNBACKHOAX, and ERH. The results demonstrate that RoBERTa-large achieved superior accuracy and F-scores of around 83%, particularly in monolingual contexts, while DistilBERT-multilingual-cased excelled in multilingual scenarios, achieving accuracy and F-scores of around 72%. This study contributes a refined dataset and provides insights into model performance, highlighting the transformative potential of transformer models in detecting AI-generated content.
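The abstract reports detector quality as accuracy and F-score on a binary task (human-written vs. AI-generated). As a minimal sketch of that evaluation, the snippet below computes both metrics from scratch, treating label 1 as "AI-generated"; the labels and predictions shown are synthetic placeholders, not results from the paper.

```python
# Sketch of the binary evaluation metrics reported in the paper:
# accuracy and F1 ("F-score"), with label 1 = AI-generated, 0 = human.
# All data here is synthetic, for illustration only.

def accuracy(y_true, y_pred):
    """Fraction of examples where the predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (AI) class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    # Hypothetical gold labels and detector predictions.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
    print(f"accuracy = {accuracy(y_true, y_pred):.2f}")  # 0.75
    print(f"F1       = {f1_score(y_true, y_pred):.2f}")  # 0.75
```

In practice the paper's fine-tuned classifiers (e.g. RoBERTa-large) would supply `y_pred`; the metric definitions are the standard ones and do not depend on the model.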
