A Survey on Efficient Protein Language Models
Abstract
Protein language models (pLMs) have become indispensable tools in computational biology, driving advances in variant effect prediction, functional annotation, structure prediction, and protein engineering. However, their rapid expansion from millions to tens of billions of parameters introduces significant computational, accessibility, and sustainability challenges that limit practical application in environments constrained by GPU memory, hardware availability, and energy budgets. This survey presents the first comprehensive review of efficient pLMs, synthesizing recent advances across four key dimensions. We first examine (1) dataset efficiency through meta-learning-based few-shot learning and scaling-law-guided data allocation; and (2) architecture efficiency via lightweight alternatives including quantized transformers, embedding compression, and convolution-based designs. We then review (3) training efficiency through scaling-law-informed pretraining, structure-integrated multimodal approaches, and low-rank adaptation combined with diverse distillation strategies; and (4) inference efficiency via quantization, dense retrieval, and structure-search methods. By providing a structured taxonomy and practical guidance, this survey supports the development of high-performance, scalable, and sustainable next-generation pLMs.
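To make the training-efficiency dimension concrete, the sketch below illustrates the general idea behind low-rank adaptation: the pretrained weights of a pLM layer are frozen and only a small low-rank update is trained. This is a minimal illustrative example, not the implementation of any specific method surveyed here; the class name, rank, scaling factor, and the 1280-dimensional hidden size (chosen to resemble a mid-sized ESM-2 model) are assumptions for demonstration.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained projection
        # Low-rank factors: A is small-random-initialized, B is zero-initialized,
        # so the adapted layer starts out identical to the pretrained one.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen projection plus scaled low-rank correction; only A and B receive gradients.
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)

# Toy usage: adapt one projection of a hypothetical pLM attention block.
proj = nn.Linear(1280, 1280)                # hidden size assumed for illustration
adapted = LoRALinear(proj, r=8, alpha=16.0)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
total = sum(p.numel() for p in adapted.parameters())
print(f"trainable params: {trainable} / {total}")  # ~20K trainable vs. ~1.6M in the full layer
```

With rank 8, only about 20K parameters per adapted layer are updated during fine-tuning, roughly 1% of the frozen layer's parameters, which is the kind of reduction that makes task-specific adaptation of billion-parameter pLMs feasible on a single GPU.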