BERT-Based Fine-Tuning for Efficient Context Similarity Analysis

Abstract

Contextual similarity detection has emerged as a critical need in academic publishing, particularly for high-impact venues such as IEEE journals. Traditional plagiarism detection methods are often insufficient because they rely primarily on exact text matching, making them easy to bypass through rephrasing. This study presents a fine-tuned BERT model designed specifically to evaluate the contextual similarity of academic papers. Trained on a curated dataset of 8,000 research papers sourced equally from arXiv and Semantic Scholar, the model achieved an accuracy of 92.52%. The proposed approach aims to strengthen the integrity of academic publishing by identifying duplicate content even when it has been paraphrased. This paper outlines the methodology, experimental results, and implications for improving originality checks in academic publishing and reducing the likelihood of duplicate papers being published.
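To make the described setup concrete, below is a minimal sketch of fine-tuning BERT for pairwise contextual-similarity classification using the Hugging Face transformers library. The model name, hyperparameters, label scheme (1 = contextually similar), and toy sentence pairs are illustrative assumptions, not the authors' exact configuration.

# Minimal sketch: fine-tune BERT as a sentence-pair similarity classifier.
# Assumed setup (not from the paper): bert-base-uncased, binary labels,
# toy data, and default AdamW hyperparameters.
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Toy pairs: (text_a, text_b, label) where 1 = contextually similar.
pairs = [
    ("Transformers capture long-range dependencies.",
     "Attention-based models handle distant context well.", 1),
    ("Transformers capture long-range dependencies.",
     "The dataset contains 8,000 research papers.", 0),
]

# Tokenize each pair jointly so BERT sees [CLS] text_a [SEP] text_b [SEP].
encodings = tokenizer(
    [a for a, _, _ in pairs],
    [b for _, b, _ in pairs],
    truncation=True, padding=True, max_length=256, return_tensors="pt",
)
labels = torch.tensor([y for _, _, y in pairs])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):  # a few epochs is typical for BERT fine-tuning
    optimizer.zero_grad()
    outputs = model(**encodings, labels=labels)
    outputs.loss.backward()
    optimizer.step()

# Inference: probability that a new pair is contextually similar.
model.eval()
with torch.no_grad():
    enc = tokenizer("Paraphrased claim about model accuracy.",
                    "A reworded statement on the model's accuracy.",
                    truncation=True, padding=True, return_tensors="pt")
    probs = torch.softmax(model(**enc).logits, dim=-1)
    print("similarity probability:", probs[0, 1].item())

In practice, pairing each candidate paper segment with previously published text and classifying the pair jointly (rather than comparing surface strings) is what allows paraphrased duplicates to be flagged.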
