Stop Wasting Time Fine-Tuning: Traditional Classifiers Shine with LLM Embeddings for Political Textual Analysis


Abstract

Large language models (LLMs) are widely applicable for political science research; however, many researchers do not have access to sufficient compute resources or data to fine-tune LLMs for their research purposes. As such, it is necessary to identify ways to deploy LLMs more efficiently for political science research applications. We examine LLMs for political science classification tasks and show that using LLMs as feature extractors for downstream classification models (an embed-then-classify pipeline) matches or exceeds the performance of LLMs fine-tuned for classification (a fine-tune-then-classify pipeline), all while requiring less compute time and data. Furthermore, we demonstrate that both the embed-then-classify and fine-tune-then-classify pipelines significantly outperform zero-shot prompting for classification using decoder-only models, an approach prevalent in the social sciences. We present a robust set of experiments with three decoder-only LLMs, 19 encoder-only LLMs, five classification models, and four fine-tuning strategies on a new political classification dataset. This dataset includes over 130,000 text sequences for multi-class classification and comprises text extracted from a variety of government documents. We further validate our findings on two other political science textual datasets.
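To make the embed-then-classify pipeline concrete, the sketch below embeds text with a frozen encoder and fits a traditional classifier on the resulting vectors. This is a minimal illustration only, not the paper's actual setup: the encoder choice (`all-MiniLM-L6-v2`), the toy texts, and the labels are all assumptions for demonstration.

```python
# Minimal embed-then-classify sketch. Model name, texts, and labels are
# illustrative placeholders, not the experimental setup from the paper.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Toy political text sequences with two illustrative classes.
texts = [
    "The committee approved the appropriations bill for fiscal year 2024.",
    "The senator introduced an amendment to the healthcare act.",
    "Defense spending was authorized under the new budget resolution.",
    "The agency issued a rule on vehicle emissions standards.",
    "Regulators finalized guidance on drinking water contaminants.",
    "The department published a rule governing workplace safety.",
]
labels = ["legislative", "legislative", "legislative",
          "regulatory", "regulatory", "regulatory"]

# Step 1: use a frozen encoder-only LLM purely as a feature extractor;
# no gradient updates or task-specific fine-tuning are performed.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical choice
X = encoder.encode(texts)

# Step 2: train a lightweight traditional classifier on the embeddings.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.33, random_state=0, stratify=labels
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```

Because the encoder's weights never change, embeddings can be computed once and reused across many downstream classifiers, which is the source of the compute savings the abstract describes relative to fine-tuning.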
