Pre-Processing Techniques on Abstractive Text Summarization for Gujarati Language

Abstract

In an era of information overload, text summarization has become a crucial tool for extracting key information from large volumes of text. Abstractive summarization, which generates a concise summary in new wording while preserving the core meaning of the source, is particularly challenging for Gujarati because of the language's distinctive linguistic characteristics. This paper focuses on the pre-processing phase of abstractive text summarization for the Gujarati language and explores five techniques: noise removal, tokenization, stop-word removal, stemming/lemmatization, and sentence segmentation. We evaluate the impact of each technique on the quality of the generated summaries using several evaluation scores, with the aim of identifying the most effective pre-processing methods for Gujarati text summarization. Our findings highlight the significance of pre-processing in improving summarization quality and provide insights for future research in this domain.
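The pre-processing steps named in the abstract can be sketched roughly as follows. This is a minimal Python illustration using only the standard library, not the authors' implementation. The function names, the regular expressions, and the tiny stop-word list are assumptions made for the example; the only fixed fact relied on is that the Gujarati Unicode block spans U+0A80 to U+0AFF. Stemming/lemmatization is left out because it needs language-specific rules or lexicons that the abstract does not describe.

```python
import re
import unicodedata

# Illustrative stop words only (an assumption, not the paper's list).
STOP_WORDS = {"અને", "છે", "આ", "તે", "માટે", "પણ", "કે"}

# Anything outside the Gujarati block, whitespace, and sentence
# punctuation (., ?, !, danda U+0964) is treated as noise.
# Simplification: this also drops Latin text, ASCII digits, and
# zero-width joiners.
NOISE_RE = re.compile(r"[^\u0A80-\u0AFF\s.?!\u0964]")
SENTENCE_END_RE = re.compile(r"(?<=[.?!\u0964])\s+")
TOKEN_RE = re.compile(r"[\u0A80-\u0AFF]+")


def remove_noise(text):
    """Strip URLs, HTML-like tags, and non-Gujarati symbols."""
    text = unicodedata.normalize("NFC", text)
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"<[^>]+>", " ", text)
    text = NOISE_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Split on whitespace that follows sentence-final punctuation."""
    return [s for s in SENTENCE_END_RE.split(text) if s]


def tokenize(sentence):
    """Return runs of Gujarati characters (letters, vowel signs, digits)."""
    return TOKEN_RE.findall(sentence)


def remove_stop_words(tokens):
    """Drop tokens that appear in the illustrative stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(text):
    """Run noise removal, sentence segmentation, tokenization, stop-word removal."""
    sentences = split_sentences(remove_noise(text))
    return [remove_stop_words(tokenize(s)) for s in sentences]


if __name__ == "__main__":
    sample = "<p>આ પુસ્તક સારું છે. તે વાંચવા માટે https://example.com જુઓ!</p>"
    for tokens in preprocess(sample):
        print(tokens)
```

Segmenting sentences before tokenizing keeps sentence boundaries available to a downstream summarizer. Whether stop-word removal helps an abstractive model at all is one of the questions the paper sets out to evaluate, so each step here is best treated as optional.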
