Comparing Large Language Models for Text Classification: Model Selection Across Tasks, Texts, and Languages



Abstract

Large-scale text analysis has grown rapidly as an analytic method in the social sciences and beyond, and recent advances in large language models (LLMs) have made automated text annotation increasingly viable. This paper focuses on the comparative viability of closed-source and open-source LLMs for text annotation, testing the performance of 28 different LLMs in text classification across a range of tasks, text types, and languages. Using data in seven languages across 10 country contexts, the results show considerable variation in model performance, highlighting that researchers should carefully consider model selection as part of their LLM-centered classification strategy. In general, the closed-source GPT-4 exhibits relatively strong performance across all classification tasks, while open-source alternatives such as Llama 3 and Qwen2.5 show similar or even superior performance on select tasks. Many smaller open-source models, however, provide relatively unsatisfactory performance on more complex and non-English coding tasks. The tradeoffs inherent in the use of each model are therefore highlighted to allow researchers to make informed decisions about model selection on a task-by-task basis.
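For readers unfamiliar with the workflow the abstract describes, the sketch below illustrates zero-shot text classification with a hosted LLM. It is a minimal illustration, not the paper's actual pipeline: the label set, prompt wording, and model name are assumptions, and the `openai` Python client stands in for any chat-completion-style model, closed or open source.

```python
# Minimal sketch of zero-shot LLM text annotation (not the paper's pipeline).
# Assumptions: the `openai` Python client, a hypothetical three-label coding
# scheme, and an illustrative prompt; any comparable model could be swapped in.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = ["positive", "negative", "neutral"]  # hypothetical coding scheme

def classify(text: str, model: str = "gpt-4") -> str:
    """Ask the model to assign exactly one label to a document."""
    prompt = (
        "Classify the following text into exactly one category: "
        f"{', '.join(LABELS)}.\n\nText: {text}\n\nAnswer with the label only."
    )
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic output helps annotation reliability
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip().lower()

# Example: annotate a small multilingual corpus and inspect the labels.
docs = ["The new policy was widely praised.", "Die Reform stößt auf Kritik."]
print([classify(d) for d in docs])
```

A model comparison of the kind the paper reports would then score labels like these against human-coded gold-standard data, repeating the loop for each candidate model, task, and language.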
