Exploring Trends and Taxonomies in Survey Papers on Large Language Models through Data Science

Aayushi Rajput

Abstract

This report analyzes a dataset of survey papers on large language models using data science techniques. The goal is to explore, manipulate, and evaluate the data to uncover publication trends and the distribution of taxonomy categories among the surveys. The exploration phase began with a time-series analysis of survey releases to visualize trends over time. Taxonomy distributions were then examined with bar and pie charts to identify the most common categories. In the manipulation phase, a feature matrix was built by applying TF-IDF vectorization to the textual fields (titles and summaries) and one-hot encoding to the categorical variables. These features were normalized and split into training and testing sets for model evaluation. In the evaluation phase, a Random Forest classifier was trained to predict each survey's taxonomy category from the extracted features. Accuracy and precision were used as performance metrics, and the model achieved an accuracy of 56.89%. LinearSVC and Logistic Regression were also evaluated and gave approximately the same accuracy as the Random Forest classifier. Although this result leaves significant room for improvement, it shows the potential of machine learning to automate the classification of survey papers by content. The analysis demonstrates how data science methods, including natural language processing (NLP) and machine learning, can be used to identify trends, engineer features, and assess models on survey data. Future work may integrate more sophisticated models and feature selection strategies to improve predictive accuracy.
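
The feature-construction step described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the author's code: the toy titles, the categorical field (a hypothetical primary category), the taxonomy labels, the whitespace tokenizer, the smoothed IDF formula, and the helper names `tfidf_matrix`, `one_hot`, and `split_rows` are all assumptions made for illustration. In practice a library vectorizer and classifier would replace these helpers.

```python
import math
import random
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocabulary, rows): one L2-normalized TF-IDF row per document."""
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (one common variant): ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        row = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0  # guard empty documents
        rows.append([v / norm for v in row])
    return vocab, rows


def one_hot(values):
    """Return (categories, rows): one indicator column per distinct value."""
    categories = sorted(set(values))
    return categories, [[1.0 if v == c else 0.0 for c in categories] for v in values]


def split_rows(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle indices reproducibly and hold out a fraction of rows for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(rows) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([rows[i] for i in train_idx], [rows[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])


# Hypothetical toy data standing in for survey titles, a categorical field,
# and taxonomy labels; none of these values come from the article's dataset.
titles = [
    "survey of large language models",
    "survey of prompt engineering",
    "evaluation of large language models",
]
primary_category = ["cs.CL", "cs.AI", "cs.CL"]
taxonomy = ["general", "prompting", "evaluation"]

vocab, text_rows = tfidf_matrix(titles)
categories, category_rows = one_hot(primary_category)
# Feature matrix: TF-IDF columns followed by one-hot columns.
features = [t + c for t, c in zip(text_rows, category_rows)]
X_train, X_test, y_train, y_test = split_rows(features, taxonomy)
```

In this sketch a term that appears in every title (such as "of") receives a lower weight than a term that appears in only some of them, which is the property that makes TF-IDF useful for separating categories. The resulting `X_train` and `y_train` could then be passed to any classifier, such as the Random Forest mentioned above.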

Version published to 10.31224/3980
Oct 2, 2024

Survey Trends using LLM Models

This article has 1 author:
1. Tasvi Adappa
This article has no evaluations
Latest version Oct 3, 2024

Exploring Large Language Model survey papers via Machine and Ensemble Learning

This article has 1 author:
1. Mehenaz Afrin
This article has no evaluations
Latest version Oct 2, 2024

LLM Survey Analysis Using Random Forest

This article has 1 author:
1. Priyanka Singla
This article has no evaluations
Latest version Oct 3, 2024
