Classifying Survey Papers on Large Language Models Using Machine

Chiaying Wu


Abstract

As research on large language models (LLMs) develops rapidly, the number of related survey papers keeps growing, which makes the field harder for scholars and researchers to navigate. This study applies machine learning techniques to classify survey papers on LLMs, with the aim of giving researchers better tools for literature retrieval and analysis. I first constructed a diverse feature matrix by integrating text data and class labels from different datasets, then preprocessed it with TF-IDF vectorization and one-hot encoding to prepare for model training. In the experiments, I implemented a random forest classifier to analyze the relationship between features and classification labels. Preliminary results indicate that the classification model achieved an accuracy of 26% without parameter tuning, which improved to 31% after hyperparameter optimization. Despite challenges such as class imbalance and feature correlation, my research provides an effective method for the automatic classification of survey papers. Future work will focus on further improving classification accuracy and on exploring other machine learning algorithms to broaden the applicability of this approach.
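
As a rough illustration of the preprocessing step described in the abstract, the sketch below builds a small feature matrix by concatenating TF-IDF text features with one-hot encoded categorical features. It is a minimal sketch under stated assumptions, not the author's implementation: the toy records, the field names `text` and `category`, and the helper functions `tokenize`, `tfidf_matrix`, and `one_hot` are all hypothetical. The IDF weighting uses the smoothed form ln((1 + n) / (1 + df)) + 1 with L2 row normalization, which is one common convention and may differ from what the study used.

```python
import math
from collections import Counter

# Hypothetical toy records; the field names and values are assumptions,
# not the dataset used in the study.
papers = [
    {"text": "survey of prompt engineering for large language models", "category": "cs.CL"},
    {"text": "survey of evaluation benchmarks for language models", "category": "cs.CL"},
    {"text": "survey of multimodal vision language models", "category": "cs.CV"},
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_matrix(docs):
    """Return (vocabulary, rows) of L2-normalized TF-IDF weights.

    IDF is smoothed: ln((1 + n) / (1 + df)) + 1.
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in tokenized for tok in set(doc))
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab}
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        row = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        rows.append([v / norm for v in row])
    return vocab, rows


def one_hot(values):
    """Return (levels, rows) with one indicator column per distinct value."""
    levels = sorted(set(values))
    rows = [[1.0 if v == level else 0.0 for level in levels] for v in values]
    return levels, rows


vocab, text_features = tfidf_matrix([p["text"] for p in papers])
levels, cat_features = one_hot([p["category"] for p in papers])

# Concatenate the two feature groups column-wise into one matrix.
feature_matrix = [t + c for t, c in zip(text_features, cat_features)]
```

A matrix of this shape is what a random forest classifier would take as input, for example scikit-learn's `RandomForestClassifier`, with a search utility such as `GridSearchCV` handling the hyperparameter optimization step. Which library and search strategy the study actually used is not stated in the abstract.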

Version published to 10.31224/3970
Oct 2, 2024

