ESM-Effect: An Effective and Efficient Fine-Tuning Framework towards accurate prediction of Mutation’s Functional Effect
This article has been reviewed by the following groups
Listed in:
- Evaluated articles (Arcadia Science)
Abstract
Predicting functional properties of mutations, such as changes in enzyme activity, remains challenging and is not well captured by traditional pathogenicity prediction. Yet such functional predictions are crucial in areas like targeted cancer therapy, where some drugs may only be administered if a mutation causes an increase in enzyme activity. Current approaches leverage either static Protein Language Model (PLM) embeddings or complex multi-modal features (e.g., static PLM embeddings combined with structural and evolutionary data), and they either (1) fall short in accuracy or (2) require complex data processing and pre-training. Standardized datasets and metrics for robust benchmarking would benefit model development but do not yet exist for functional effect prediction.
To address these challenges, we develop ESM-Effect, a PLM-based functional effect prediction framework optimized through extensive ablation studies. ESM-Effect fine-tunes the ESM2 PLM with an inductive-bias regression head to achieve state-of-the-art performance. It surpasses the multi-modal state-of-the-art method PreMode, indicating redundancy of its structural and evolutionary features, while training 6.7 times faster.
In addition, we develop a benchmarking framework with robust test datasets and strategies, and propose a novel metric for prediction accuracy termed relative Bin-Mean Error (rBME): rBME emphasizes prediction accuracy in challenging, non-clustered, and rare gain-of-function regions and correlates more intuitively with model performance than the commonly used Spearman's rho. Finally, we demonstrate partial generalization of ESM-Effect to unseen mutational regions within the same protein, illustrating its potential in precision medicine applications. Extending this generalization across different proteins remains a promising direction for future research. ESM-Effect is available at: https://github.com/moritzgls/ESM-Effect.
Article activity feed
-
Even when only the first layer (immediately after the embedding layer) is unfrozen, it can still influence the subsequent layers, enabling the model to produce informative embeddings for the regression head at the final layer.
This is fascinating! I wonder how the perplexity of the pre-training task is affected by which layer you choose to unfreeze.
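The freezing scheme discussed here can be sketched in PyTorch. This is a minimal illustration using a toy 12-layer encoder as a stand-in for the 35M ESM2 model (the layer count matches the paper; all dimensions and module choices below are illustrative assumptions, not the authors' code):

```python
import torch.nn as nn

# Toy stand-in for a 12-layer PLM encoder (dimensions are illustrative).
embed_dim, n_layers = 64, 12
embedding = nn.Embedding(33, embed_dim)
layers = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
     for _ in range(n_layers)]
)

# Freeze the embedding and every transformer layer ...
for p in embedding.parameters():
    p.requires_grad = False
for layer in layers:
    for p in layer.parameters():
        p.requires_grad = False

# ... then unfreeze only the first layer (immediately after the embedding).
# Gradients still flow through the frozen upper layers, so updating this
# one layer can reshape the final-layer embeddings seen by the head.
for p in layers[0].parameters():
    p.requires_grad = True

trainable = sum(p.numel() for l in layers for p in l.parameters() if p.requires_grad)
total = sum(p.numel() for l in layers for p in l.parameters())
print(f"trainable transformer params: {trainable}/{total}")
```

An optimizer built with `filter(lambda p: p.requires_grad, ...)` would then update only that single unfrozen layer.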
-
The ESM-Effect Architecture thus comprises the 35M ESM2 model with 10 of 12 layers frozen and the mutation position regression head (cf. Figure 2). The model’s performance is driven by two key inductive biases in the regression head:
How would you extend this combined head architecture (mutation position embedding + mean pooled) if you were looking at the effect of a multi-mutation variant?
One strategy I can think of would be to slice out all mutation positions and pool them. I'm wondering if you guys thought about generalizing the architecture to scenarios when the number of mutations in your DMS dataset varies.
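The "slice out all mutation positions and pool them" idea sketched above can be made concrete. The head below is an assumption-laden illustration (dimensions, MLP shape, and the name `CombinedHead` are hypothetical, not from the paper): it concatenates a pooled embedding over the mutated positions with the mean-pooled sequence embedding, so the same head handles one or several mutations per variant.

```python
import torch
import torch.nn as nn

class CombinedHead(nn.Module):
    """Sketch: per-mutation-position pooling + mean-pooled sequence embedding."""
    def __init__(self, embed_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, 1),
        )

    def forward(self, hidden, positions_mask):
        # hidden: (batch, seq_len, embed_dim) final-layer embeddings
        # positions_mask: (batch, seq_len) bool, True at mutated positions
        mask = positions_mask.unsqueeze(-1).float()
        # Average over all mutated positions -> works for 1..k mutations.
        mut_pooled = (hidden * mask).sum(1) / mask.sum(1).clamp(min=1.0)
        mean_pooled = hidden.mean(1)
        return self.mlp(torch.cat([mut_pooled, mean_pooled], dim=-1)).squeeze(-1)

hidden = torch.randn(2, 10, 64)
mask = torch.zeros(2, 10, dtype=torch.bool)
mask[0, 3] = True          # single-mutation variant
mask[1, [2, 7]] = True     # double-mutation variant, same head applies
scores = CombinedHead()(hidden, mask)
print(scores.shape)
```

Averaging keeps the pooled vector's scale independent of the mutation count; summing or attention-weighted pooling over the mutated positions would be natural alternatives when mutation counts vary widely across the DMS dataset.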
-
Figure 1:
Seeing the pre-training validation perplexity in (1) made me wonder: did you ever assess how fine-tuning affected post-training perplexity? This could be a proxy to gauge how disruptive fine-tuning is for the pre-training task.
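The post-fine-tuning perplexity probe suggested here is cheap to compute: evaluate the masked-language-modeling cross-entropy at masked positions before and after fine-tuning and exponentiate. A minimal sketch (the helper name and shapes are assumptions; real logits would come from the fine-tuned model):

```python
import math
import torch
import torch.nn.functional as F

def masked_lm_perplexity(logits, targets, mask):
    """Perplexity over masked positions: exp(mean cross-entropy).
    logits: (batch, seq_len, vocab); targets: (batch, seq_len); mask: bool."""
    ce = F.cross_entropy(logits[mask], targets[mask])
    return math.exp(ce.item())

# Sanity check: uniform logits over a 33-token vocabulary (ESM2's alphabet
# size) must give perplexity equal to the vocabulary size.
logits = torch.zeros(1, 8, 33)
targets = torch.randint(0, 33, (1, 8))
mask = torch.ones(1, 8, dtype=torch.bool)
ppl = masked_lm_perplexity(logits, targets, mask)
print(round(ppl, 2))  # 33.0
```

Comparing this quantity on held-out sequences before and after fine-tuning would quantify how disruptive the regression objective is to the pre-training task, complementing the unfrozen-layer ablation above.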
-