USING LARGE LANGUAGE MODELS IN SHORT TEXT TOPIC MODELING: MODEL CHOICE AND SAMPLE SIZE
Abstract
This study explores the efficacy of large language models (LLMs) in short-text topic modeling, comparing their performance with human evaluation and Latent Dirichlet Allocation (LDA). In Study 1, we analyzed a dataset on chatbot anthropomorphism using human evaluation, LDA, and two LLMs (GPT-4 and Claude). Results showed that the LLMs produced topic classifications similar to those of human analysis and outperformed LDA on short texts. In Study 2, we investigated the impact of sample size and LLM choice on topic modeling consistency using a COVID-19 vaccine hesitancy dataset. Findings revealed high consistency (80-90%) across various sample sizes, with even a 5% sample achieving 90% consistency. A comparison of three LLMs (Gemini Pro 1.5, GPT-4o, and Claude 3.5 Sonnet) showed comparable performance, with two of the models achieving 90% consistency. This research demonstrates that LLMs can effectively perform short-text topic modeling in medical informatics, offering a promising alternative to traditional methods. The high consistency achieved with small sample sizes suggests potential gains in research efficiency. However, variations in performance highlight the importance of model selection and the need for human supervision in topic modeling tasks.
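To illustrate the general approach described above, the sketch below shows one way an LLM could be prompted to assign topic labels to short texts and to label only a small random subsample (e.g., 5%), in the spirit of the sample-size analysis in Study 2. It assumes the OpenAI Python client (v1.x) and an OPENAI_API_KEY in the environment; the topic labels, prompt wording, and sampling helper are illustrative assumptions, not the authors' exact protocol.

```python
# Minimal sketch of LLM-based topic assignment for short texts, assuming the
# OpenAI Python client (>=1.0). Topic labels, prompt wording, and the 5% random
# sample are illustrative assumptions, not the authors' exact procedure.
import random
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical topic labels for a vaccine-hesitancy corpus
TOPICS = ["safety concerns", "distrust of institutions", "misinformation", "other"]


def assign_topic(text: str) -> str:
    """Ask the model to pick the single best-fitting topic for one short text."""
    prompt = (
        "Classify the following short text into exactly one of these topics: "
        + ", ".join(TOPICS)
        + ".\nRespond with the topic name only.\n\nText: "
        + text
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduce randomness for more consistent labeling
    )
    return response.choices[0].message.content.strip()


def label_sample(texts: list[str], fraction: float = 0.05, seed: int = 0) -> dict[str, str]:
    """Label a random subsample (e.g., 5%) to probe consistency at small sample sizes."""
    rng = random.Random(seed)
    sample = rng.sample(texts, max(1, int(len(texts) * fraction)))
    return {t: assign_topic(t) for t in sample}
```

In practice, labels produced this way would still be checked against human coding (as in Study 1) and compared across models and repeated runs to estimate the kind of consistency reported here.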