Large Language Models for Text Classification: From Zero-Shot Learning to Instruction-Tuning
Abstract
Advances in large language models (LLMs) have transformed the field of natural language processing and have enormous potential for social scientific analysis. We explore the application of LLMs to supervised text classification. As a case study, we consider stance detection and examine variation in predictive accuracy across different architectures, training regimes, and task specifications. We compare ten models ranging in size from tens of millions to hundreds of billions of parameters and four distinct training regimes: prompt-based zero-shot learning and few-shot learning, fine-tuning with larger amounts of training data, and instruction-tuning, which combines prompting and training data. The largest models generally offer the best predictive performance even with few or no training examples, but fine-tuning smaller models is a competitive solution due to their relatively high accuracy and low cost. For complex prediction tasks, instruction-tuned open-weights models can perform well, rivaling state-of-the-art commercial models. We provide recommendations for the use of LLMs for text classification in sociological research and discuss the limitations and challenges related to the use of these technologies.