Resource-Efficient Few-Shot Plant Disease Classification via Quantized Low-Rank Adapters in Vision Transformers
Abstract
Rapid and accurate detection of plant diseases from limited labeled data remains a critical challenge in digital agriculture. In this paper, we present a few-shot learning framework for plant disease classification that integrates a self-supervised Vision Transformer backbone (DINOv2-S) with a Prototypical Network classifier, adapted via Quantized Low-Rank Adaptation (QLoRA). While QLoRA has demonstrated strong efficiency gains in natural language processing since its introduction, its application to vision-domain few-shot learning has not been systematically explored. Our framework fine-tunes only about 1% of the model parameters (~221K) using low-rank adapters with 4-bit quantization, reducing trainable-parameter storage to approximately 2.53 MB and achieving an inference latency of 0.20 ms per image on NVIDIA A100 hardware. Comprehensive experiments on the PlantDoc and PlantVillage datasets show competitive or improved performance relative to recent approaches: in the 5-way 5-shot setting at 224×224 resolution, the proposed framework achieves mean accuracies of 86.85 ± 0.42% and 94.51 ± 0.35%, respectively. Ablation studies across shot counts, image resolutions, backbone architectures, LoRA rank configurations, and fine-tuning strategies further confirm the robustness and efficiency of the proposed approach. These results suggest that parameter-efficient adaptation of vision transformers offers a practical pathway for deploying disease-diagnosis systems in resource-constrained agricultural settings.
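The Prototypical Network classifier named in the abstract admits a minimal sketch: each class prototype is the mean embedding of that class's support images, and a query is assigned to the nearest prototype. The function names and toy 2-D embeddings below are illustrative, not the authors' implementation; in the paper the embeddings would come from the QLoRA-adapted DINOv2-S backbone.

```python
import numpy as np

def compute_prototypes(support_emb, support_labels, n_way):
    # Prototype for class c = mean of the support embeddings labeled c.
    return np.stack([support_emb[support_labels == c].mean(axis=0)
                     for c in range(n_way)])

def classify_queries(query_emb, prototypes):
    # Euclidean distance from every query to every prototype,
    # then nearest-prototype assignment.
    dists = np.linalg.norm(query_emb[:, None, :] - prototypes[None, :, :],
                           axis=-1)
    return dists.argmin(axis=1)

# Toy 2-way 2-shot episode with hand-made 2-D "embeddings".
support = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]])
labels = np.array([0, 0, 1, 1])
protos = compute_prototypes(support, labels, n_way=2)
preds = classify_queries(np.array([[0.2, 0.1], [4.9, 5.0]]), protos)
```

In the 5-way 5-shot setting of the paper, `n_way` would be 5 with five support embeddings per class; the classifier itself stays parameter-free, which is why only the adapters contribute trainable parameters.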
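The low-rank adapter budget behind the ~221K trainable-parameter figure follows the standard LoRA accounting: each adapted weight W of shape d_out × d_in gains a delta ΔW = B·A with A of shape r × d_in and B of shape d_out × r, contributing r·(d_in + d_out) trainable parameters. The dimensions below are illustrative (a ViT-S-sized 384-dimensional projection); the paper's exact choice of adapted layers and rank is not restated here.

```python
def lora_param_count(d_in, d_out, rank):
    # Trainable parameters added by one LoRA adapter:
    # A (rank x d_in) plus B (d_out x rank).
    return rank * (d_in + d_out)

# Example: one square 384x384 projection at rank 8 adds
# 8 * (384 + 384) = 6144 trainable parameters.
per_matrix = lora_param_count(384, 384, 8)
```

Summing such per-matrix counts over the adapted projections in all transformer blocks yields the total adapter budget, which stays a small fraction of the frozen backbone.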