Survey Trends using LLM Models

Tasvi Adappa

Abstract

This report analyzes the survey papers in a specific dataset using standard data science techniques. The objective is to explore, transform, and model the data in order to understand publication trends and the distribution of taxonomy categories among surveys in this domain. Data exploration began with a time-series analysis of survey release dates to visualize trends over time. Taxonomy distributions were then examined with bar charts and pie charts to identify the most frequent categories. In the data manipulation phase, we constructed a feature matrix by applying TF-IDF vectorization to the text fields (titles and summaries) and one-hot encoding to the categorical variables. The features were then normalized and split into training and testing sets. For evaluation, a Random Forest classifier was trained to predict each survey's taxonomy from the extracted features. Performance was measured with accuracy, precision, recall, and F1-score; the model achieved an accuracy of 34.48%. Although this leaves considerable room for improvement, the analysis demonstrates the potential of machine learning for automating the classification of survey papers by content. More broadly, the study illustrates how natural language processing (NLP) and machine learning can be applied to trend analysis, feature engineering, and model evaluation on survey data. Future work could use more advanced models and feature selection techniques to improve predictive accuracy.
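The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the author's actual code: the toy records, the `source_field` categorical column, the taxonomy labels, and the `run_pipeline` helper are all invented for the example. Two choices here are assumptions rather than details from the abstract: titles and summaries are concatenated before TF-IDF, and the data is split before the vectorizer, encoder, and scaler are fitted, so that the test set does not influence the learned features.

```python
import numpy as np
from scipy.sparse import hstack
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MaxAbsScaler, OneHotEncoder

# Toy records invented for illustration (NOT the study's dataset).
# Layout: (title, summary, hypothetical categorical field, taxonomy label)
records = [
    ("A survey of prompt engineering", "Reviews prompting methods for language models.", "cs.CL", "Prompting"),
    ("Prompt design patterns", "Catalogs prompt templates and instruction styles.", "cs.CL", "Prompting"),
    ("In-context learning overview", "Surveys few-shot prompting and demonstrations.", "cs.LG", "Prompting"),
    ("Chain-of-thought prompting review", "Summarizes reasoning prompts for models.", "cs.CL", "Prompting"),
    ("Benchmarks for language models", "Surveys evaluation datasets and metrics.", "cs.CL", "Evaluation"),
    ("Measuring model robustness", "Reviews evaluation of robustness and bias.", "cs.LG", "Evaluation"),
    ("Human evaluation of generated text", "Surveys annotation protocols and metrics.", "cs.CL", "Evaluation"),
    ("Evaluating factual accuracy", "Reviews benchmarks for hallucination detection.", "cs.AI", "Evaluation"),
    ("Efficient fine-tuning methods", "Surveys adapters and low-rank training.", "cs.LG", "Training"),
    ("Instruction tuning survey", "Reviews supervised training on instructions.", "cs.CL", "Training"),
    ("Pretraining data curation", "Surveys corpus filtering for model training.", "cs.LG", "Training"),
    ("Alignment through feedback", "Reviews training with preference feedback.", "cs.AI", "Training"),
]


def run_pipeline(rows, seed=0):
    """Train a Random Forest on TF-IDF + one-hot features and return test metrics."""
    texts = np.array([f"{title} {summary}" for title, summary, _, _ in rows])
    cats = np.array([[cat] for _, _, cat, _ in rows])  # 2-D, as OneHotEncoder expects
    labels = np.array([label for *_, label in rows])

    # Split row indices first; stratify so every class appears in both sets.
    train_idx, test_idx = train_test_split(
        np.arange(len(rows)), test_size=0.25, stratify=labels, random_state=seed
    )

    tfidf = TfidfVectorizer(stop_words="english")
    onehot = OneHotEncoder(handle_unknown="ignore")
    scaler = MaxAbsScaler()  # scales each feature to [-1, 1] and keeps sparsity

    # Fit the transformers on the training rows only, then reuse them on test rows.
    X_train = scaler.fit_transform(
        hstack([tfidf.fit_transform(texts[train_idx]),
                onehot.fit_transform(cats[train_idx])]).tocsr()
    )
    X_test = scaler.transform(
        hstack([tfidf.transform(texts[test_idx]),
                onehot.transform(cats[test_idx])]).tocsr()
    )

    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, labels[train_idx])
    predicted = model.predict(X_test)

    # Macro averaging weights every taxonomy class equally.
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels[test_idx], predicted, average="macro", zero_division=0
    )
    return {
        "accuracy": float(accuracy_score(labels[test_idx], predicted)),
        "precision": float(precision),
        "recall": float(recall),
        "f1": float(f1),
    }


if __name__ == "__main__":
    for name, value in run_pipeline(records).items():
        print(f"{name}: {value:.4f}")
```

The scores printed for this toy data say nothing about the reported 34.48% accuracy; the sketch only shows how the steps fit together. `MaxAbsScaler` is used for normalization because it accepts sparse TF-IDF matrices directly, which is one reasonable option among several.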

Version published to 10.31224/3977
Oct 3, 2024

Related articles

ChatGPT as an Artificial Intelligence Tool to Provide an Analytical Boost to Text Mining: An Application to TUCKER Models for Multiway Tables

This article has 3 authors:
1. Roberto Cascante-Yarlequé
2. Purificación Galindo-Villardón
3. Fabricio Guevara-Viejó
This article has no evaluations. Latest version: Mar 25, 2026

A large-scale, granular topic classification system for scientific documents

This article has 3 authors:
1. Gard B. Jenset
2. Peter J. Bevan
3. Akarsh Jain
This article has no evaluations. Latest version: Mar 31, 2026

AI for Survey Design: Generating and Evaluating Survey Questions with Large Language Models

This article has 3 authors:
1. Anna Fuchs
2. Anna-Carolina Haensch
3. Wiebke Weber
This article has no evaluations. Latest version: Mar 12, 2026