Parameter-Efficient Fine-Tuning (PEFT) Approaches for Large Language Models: A Comparative Analysis on AG News

Abstract

Text classification remains a cornerstone task in Natural Language Processing (NLP), playing a critical role in organizing and understanding large-scale textual data. This study investigates the performance of traditional machine learning algorithms and transformer-based architectures on the AG News dataset, a widely used benchmark for multi-class news classification. In particular, the performance of the DistilBERT model and its Low-Rank Adaptation (LoRA)-enhanced version is examined under a consistent experimental framework that includes different vectorization techniques and parameter configurations. The classical models are evaluated using Count Vectorizer, TF-IDF, Hashing Vectorizer, and semantic embeddings via Word2Vec (CBOW and Skip-gram), while the transformer-based models are fine-tuned with varying batch sizes, input lengths, and epochs. The experimental results demonstrate that traditional classifiers, such as Ridge Classifier and Complement Naive Bayes, achieve strong performance when paired with TF-IDF and Count Vectorizer, yielding accuracies exceeding 89% with minimal computational overhead. Meanwhile, the standard DistilBERT model achieves 89.8% accuracy but requires over 6 hours of training. By contrast, the LoRA-enhanced DistilBERT attains 90.0% accuracy with a 40% reduction in training time, highlighting the impact of parameter-efficient fine-tuning (PEFT) strategies. These findings underscore the trade-offs between model complexity, computational efficiency, and classification accuracy, and establish LoRA as a practical solution for scalable transformer fine-tuning.
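The LoRA-based fine-tuning setup described in the abstract can be sketched as follows using the Hugging Face datasets, transformers, and peft libraries. This is a minimal illustrative sketch, not the study's actual configuration: the LoRA rank, alpha, dropout, target modules, batch size, sequence length, epoch count, and learning rate shown here are assumptions chosen for demonstration.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
from peft import LoraConfig, TaskType, get_peft_model

# AG News: 4 classes (World, Sports, Business, Sci/Tech)
dataset = load_dataset("ag_news")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # max_length=128 is an assumed input length, not the paper's setting
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=4)

# Wrap the base model with LoRA adapters on the attention projections.
# Rank, alpha, dropout, and target modules are illustrative assumptions.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_lin", "v_lin"],  # DistilBERT query/value projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train

args = TrainingArguments(
    output_dir="distilbert-lora-agnews",
    per_device_train_batch_size=32,  # assumed batch size
    num_train_epochs=3,              # assumed epoch count
    learning_rate=2e-4,              # assumed learning rate
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
print(trainer.evaluate())
```

Because only the low-rank adapter matrices (and the classification head) are updated while the frozen DistilBERT weights are reused, this kind of setup is what allows the training-time reduction reported in the abstract relative to full fine-tuning.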
