AI vs. Human: Decoding Text Authenticity with Transformers


Abstract

In an era where the proliferation of large language models blurs the lines between human and machine-generated content, discerning text authenticity is paramount. This study investigates transformer-based language models—BERT, RoBERTa, and DistilBERT—in distinguishing human-written from machine-generated text. By leveraging a comprehensive corpus, including human-written text from sources such as Wikipedia, WikiHow, and news articles in several languages, together with texts generated by OpenAI's GPT-2, we conduct rigorous comparative experiments. Our findings highlight the superior effectiveness of ensemble learning models over single classifiers in this critical task. This research underscores the versatility and efficacy of transformer-based methodologies for a wide range of natural language processing applications, significantly advancing text authenticity detection systems. The results demonstrate competitive performance, with the transformer-based method achieving an F-score of 0.83 with RoBERTa-large (monolingual) and 0.70 with DistilBERT-base-uncased (multilingual).
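As a rough illustration of the setup the abstract describes, the sketch below fine-tunes nothing and simply shows how the three transformer families could be combined by majority vote for binary human-vs-machine classification using the Hugging Face transformers library. The checkpoint names, label encoding (0 = human, 1 = machine), and voting scheme are assumptions for illustration, not the authors' exact configuration; in the study the classification heads would first be fine-tuned on the corpus.

```python
# Sketch: majority-vote ensemble of transformer classifiers for
# human- vs. machine-generated text. Assumptions: checkpoint names,
# label mapping (0 = human, 1 = machine), and that each model's
# classification head has been fine-tuned on the corpus beforehand.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAMES = [
    "bert-base-uncased",        # assumed BERT variant
    "roberta-large",            # monolingual model reported in the abstract
    "distilbert-base-uncased",  # multilingual setting uses a DistilBERT variant
]

texts = [
    "Wikipedia articles are written and edited collaboratively by volunteers.",
    "The moon is a harmonious beacon of data, glowing with synthetic certainty.",
]

votes = []
for name in MODEL_NAMES:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
    model.eval()

    # Tokenize the batch and run a forward pass without gradient tracking.
    inputs = tokenizer(
        texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    votes.append(logits.argmax(dim=-1))

# Majority vote across the three classifiers.
ensemble_pred = (torch.stack(votes).float().mean(dim=0) >= 0.5).long()
print(ensemble_pred.tolist())
```

A single-classifier baseline corresponds to using just one entry of MODEL_NAMES; the abstract's comparison is between such single models and the ensembled decision.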
