Comparing the Performance of SOTA Text Summarization Models on AI Research Papers
Abstract
In academic and research work, efficient, accurate and privacy-focused summarization of research papers has emerged as a significant area of interest. The availability of such a tool would make tasks like literature surveys much easier and faster, enabling academics to channel their energy into other aspects of their work. In this work, the best summarization model is identified by comparing the performance of four State Of The Art (SOTA) text summarization models: Facebook’s Bart Large CNN, Phil Schmid’s Bart Large CNN SAMSum (Bart Large CNN fine-tuned on the SAMSum dataset), Sam Shleifer’s DistilBART CNN 12 6, and Google’s PEGASUS CNN Dailymail. The initial part of this work evaluates the summaries generated by the vanilla models on data obtained from the AI Arxiv2 Dataset, which contains a diverse range of information about research papers published on ArXiv in the AI domain. The generated summaries are scored using the BLEU, ROUGE, BERTScore and METEOR metrics, and the models are compared to determine the best summarizer. The vanilla models are then fine-tuned on a separate training dataset of 1,377 examples from the AI Arxiv2 Dataset, in an attempt to improve their ability to summarize text containing AI jargon and terminology. The latter part of this work evaluates the summaries generated by these fine-tuned models on a test dataset of 326 examples.
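For concreteness, the evaluation loop described above can be sketched in a few lines of Python. This is an illustrative sketch only, assuming the Hugging Face transformers and evaluate libraries; the Hub model IDs are inferred from the model names in the abstract, and the generation settings are placeholder assumptions rather than the exact configuration used in this work.

from transformers import pipeline
import evaluate

# Hub checkpoint IDs assumed from the four model names above.
MODEL_IDS = [
    "facebook/bart-large-cnn",
    "philschmid/bart-large-cnn-samsum",
    "sshleifer/distilbart-cnn-12-6",
    "google/pegasus-cnn_dailymail",
]

# The four metrics named in the abstract.
bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")
bertscore = evaluate.load("bertscore")

def summarize_and_score(model_id, documents, references):
    """Summarize `documents` with one checkpoint and score against `references`."""
    summarizer = pipeline("summarization", model=model_id)
    # max_length/min_length are illustrative defaults, not the paper's settings.
    outputs = summarizer(documents, max_length=142, min_length=56, truncation=True)
    predictions = [o["summary_text"] for o in outputs]
    return {
        "bleu": bleu.compute(predictions=predictions, references=references),
        "rouge": rouge.compute(predictions=predictions, references=references),
        "meteor": meteor.compute(predictions=predictions, references=references),
        "bertscore": bertscore.compute(predictions=predictions,
                                       references=references, lang="en"),
    }

# Example: compare all four vanilla checkpoints on a test split.
# docs, refs = load_ai_arxiv2_test_split()  # hypothetical loader
# for model_id in MODEL_IDS:
#     print(model_id, summarize_and_score(model_id, docs, refs))

The same function applies unchanged to the fine-tuned checkpoints: passing a local directory containing a fine-tuned model to pipeline() in place of a Hub ID reuses the identical scoring path, which keeps the vanilla and fine-tuned comparisons directly comparable.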