Authorship Attribution in Hindi Literary Texts: An Exploration of Traditional Linguistic Approaches and Experimentation with Multilingual BERT

Abstract

Authorship attribution, the task of assigning a manuscript to its author, has been carried out successfully for English literature. This study hypothesizes that authorship in Hindi literature is better captured by the semantic structure of a text than by its surface linguistic features. We test this hypothesis by comparing m-BERT with traditional stylometric methods, namely n-grams, Bag of Words (BoW), and Term Frequency-Inverse Document Frequency (TF-IDF), on a curated dataset of Hindi stories. Our findings reveal that, for authorship attribution of Hindi stories, the traditional methods are more effective than the modern m-BERT approach. We describe the dataset preparation, research methodology, and results, and discuss what the findings imply for the future scope of this work.
