Efficient Inference, Training, and Fine-tuning of Protein Language Models
Abstract
Protein language models (PLMs) have shown great promise in predicting protein structure, function, and the effects of missense variants on protein fitness. However, their use has been limited by the substantial computational resources they require. In this work, we focus on improving the computational efficiency of PLMs, specifically the Evolutionary Scale Modeling (ESM) family, to make these models more broadly accessible. By implementing optimizations such as FlashAttention and Partition-Attention, a novel technique designed to handle proteins of variable length, we achieved a 16-fold speedup in inference time and a 3- to 14-fold reduction in memory usage for long proteins. Additionally, applying 4-bit quantization to billion-parameter models reduced memory consumption by 2 to 3 times with minimal performance loss on the missense variant effect prediction task. Training efficiency was also improved, with a 6-fold reduction in runtime achieved through activation checkpointing and the DeepSpeed ZeRO-Offload strategy. For fine-tuning, we employed parameter-efficient methods that enable state-of-the-art predictions of protein properties and functions by training only the model head or a small fraction of adapter weights. For instance, we achieved a Spearman’s correlation coefficient of 70% in melting point prediction and an 87% area under the precision-recall curve (AU-PRC) for transcription factor prediction. Our efficient ESM (ESME) implementation significantly lowers the barrier to using these powerful models, making them accessible to academic laboratories with limited computational resources. ESME is available on PyPI (pypi.org/project/esm-efficient) and GitHub (github.com/uci-cbcl/esm-efficient).
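
As a concrete illustration of the memory-reduction idea, the sketch below loads a public ESM-2 checkpoint with 4-bit weight quantization and scores a single missense variant from the wild-type log-probabilities. This is a minimal sketch, not the ESME package itself: it assumes the Hugging Face transformers and bitsandbytes libraries, the public facebook/esm2_t33_650M_UR50D checkpoint, and a toy sequence and variant chosen purely for illustration; ESME's own API may differ (see the repository README).

```python
# Minimal sketch (not the authors' ESME code): 4-bit quantized ESM-2 inference
# via Hugging Face transformers + bitsandbytes, followed by a simple
# wild-type-marginal variant effect score. Requires a CUDA GPU for 4-bit loading.
# Checkpoint, sequence, and variant below are illustrative only.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM, BitsAndBytesConfig

model_id = "facebook/esm2_t33_650M_UR50D"  # larger, billion-parameter checkpoints benefit most

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store linear-layer weights in 4 bits
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(
    model_id, quantization_config=quant_config, device_map="auto"
)
model.eval()

# Toy wild-type sequence and a single missense variant (1-based position 5, R -> V).
sequence = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLA"
pos, alt_aa = 5, "V"
ref_aa = sequence[pos - 1]

inputs = tokenizer(sequence, return_tensors="pt").to(model.device)
with torch.no_grad():
    log_probs = torch.log_softmax(model(**inputs).logits, dim=-1)

# Wild-type-marginal effect score: log p(alt) - log p(ref) at the variant
# position; the 1-based protein position equals the token index because the
# tokenizer prepends a <cls> token at index 0.
tok_idx = pos
ref_id = tokenizer.convert_tokens_to_ids(ref_aa)
alt_id = tokenizer.convert_tokens_to_ids(alt_aa)
score = (log_probs[0, tok_idx, alt_id] - log_probs[0, tok_idx, ref_id]).item()
print(f"{ref_aa}{pos}{alt_aa} effect score: {score:.3f}")
```

Billion-parameter checkpoints such as the 3B and 15B ESM-2 models are where the 2- to 3-fold memory savings from 4-bit quantization reported above matter most; the 650M model is used here only to keep the example lightweight.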