Pre-Processing Techniques on Abstractive Text Summarization for Gujarati Language
Abstract
In the era of information overload, text summarization has become a crucial tool for extracting key information from large volumes of text. Abstractive summarization, which generates concise summaries while preserving the core meaning, is particularly challenging for languages like Gujarati because of their distinctive linguistic characteristics. This paper focuses on the pre-processing phase of abstractive text summarization for the Gujarati language, exploring techniques such as noise removal, tokenization, stop-word removal, stemming/lemmatization, and sentence segmentation. By evaluating the impact of these techniques on the quality of generated summaries using several evaluation scores, we aim to identify the most effective pre-processing methods for Gujarati text summarization. Our findings highlight the significance of pre-processing in improving summarization quality and provide insights for future research directions in this domain.
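To make the pre-processing steps named in the abstract concrete, the following is a minimal Python sketch of such a pipeline, not the authors' implementation. The stop-word list, the suffix list, and all function names are illustrative assumptions; a real system would use curated Gujarati resources and a proper stemmer or lemmatizer. The sketch assumes that noise means URLs, HTML tags, and any character outside the Gujarati Unicode block (U+0A80 to U+0AFF), and that sentences end at ".", "?", "!", or the danda (U+0964).

```python
import re
import unicodedata

# Illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"છે", "અને", "આ", "તે", "પણ", "માટે"}

# Illustrative case-marker suffixes for naive stemming (an assumption).
SUFFIXES = ("માં", "થી", "ને", "ના", "ની", "નો")


def remove_noise(text):
    """Strip URLs, HTML tags, and non-Gujarati characters."""
    text = unicodedata.normalize("NFC", text)
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"<[^>]+>", " ", text)       # HTML tags
    # Keep the Gujarati block, whitespace, sentence punctuation,
    # and the zero-width (non-)joiners used inside some conjuncts.
    text = re.sub(r"[^\u0A80-\u0AFF\s.?!\u0964\u200C\u200D]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Segment on '.', '?', '!' and the danda."""
    parts = re.split(r"[.?!\u0964]+", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Whitespace tokenization; punctuation was consumed by the splitter."""
    return sentence.split()


def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least two code points."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 1:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Return one list of stemmed, stop-word-free tokens per sentence."""
    clean = remove_noise(text)
    return [
        [stem(t) for t in remove_stop_words(tokenize(s))]
        for s in split_sentences(clean)
    ]


if __name__ == "__main__":
    sample = "આ પુસ્તક સારું છે. <b>હું</b> શાળામાં જાઉં છું! https://example.com"
    for sentence_tokens in preprocess(sample):
        print(sentence_tokens)
```

Sentence segmentation runs before tokenization here so that each sentence keeps its own token list, which is the shape most summarization models expect as input. Each step is a separate function, so individual steps can be switched off to compare their effect on summary quality.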