ESM-Effect: An Effective and Efficient Fine-Tuning Framework towards accurate prediction of Mutation’s Functional Effect
This article has been reviewed by the following groups
Listed in:
- Evaluated articles (Arcadia Science)
Abstract
Predicting functional properties of mutations, such as changes in enzyme activity, remains challenging and is not well captured by traditional pathogenicity prediction. Yet such functional predictions are crucial in areas like targeted cancer therapy, where some drugs may only be administered if a mutation causes an increase in enzyme activity. Current approaches leverage either static Protein Language Model (PLM) embeddings or complex multi-modal features (e.g., static PLM embeddings combined with structural and evolutionary data), and they either (1) fall short in accuracy or (2) require complex data processing and pre-training. Standardized datasets and metrics for robust benchmarking would benefit model development but do not yet exist for functional effect prediction.
To address these challenges, we develop ESM-Effect, a PLM-based functional effect prediction framework optimized through extensive ablation studies. ESM-Effect fine-tunes the ESM2 PLM with an inductive-bias regression head to achieve state-of-the-art performance. It surpasses the multi-modal state-of-the-art method PreMode, indicating redundancy of its structural and evolutionary features, while training 6.7 times faster.
In addition, we develop a benchmarking framework with robust test datasets and strategies, and propose a novel metric for prediction accuracy termed relative Bin-Mean Error (rBME): rBME emphasizes prediction accuracy in challenging, non-clustered, and rare gain-of-function regions and correlates more intuitively with model performance than the commonly used Spearman's rho. Finally, we demonstrate partial generalization of ESM-Effect to unseen mutational regions within the same protein, illustrating its potential in precision medicine applications. Extending this generalization across different proteins remains a promising direction for future research. ESM-Effect is available at: https://github.com/moritzgls/ESM-Effect.
Article activity feed
-
Even when only the first layer (immediately after the embedding layer) is unfrozen, it can still influence the subsequent layers, enabling the model to produce informative embeddings for the regression head at the final layer.
This is fascinating! I wonder how the perplexity of the pre-training task is affected by which layer you choose to unfreeze.
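The freezing scheme discussed here can be sketched in PyTorch. This is a minimal illustration using a toy 12-layer encoder as a stand-in for the 35M ESM2 model (the layer count matches the paper; all dimensions and module choices below are illustrative assumptions, not the authors' code):

```python
import torch.nn as nn

# Toy stand-in for a 12-layer PLM encoder (dimensions are illustrative).
embed_dim, n_layers = 64, 12
embedding = nn.Embedding(33, embed_dim)
layers = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
     for _ in range(n_layers)]
)

# Freeze the embedding and every transformer layer ...
for p in embedding.parameters():
    p.requires_grad = False
for layer in layers:
    for p in layer.parameters():
        p.requires_grad = False

# ... then unfreeze only the first layer (immediately after the embedding).
# Gradients still flow through the frozen upper layers, so updating this
# one layer can reshape the final-layer embeddings seen by the head.
for p in layers[0].parameters():
    p.requires_grad = True

trainable = sum(p.numel() for l in layers for p in l.parameters() if p.requires_grad)
total = sum(p.numel() for l in layers for p in l.parameters())
print(f"trainable transformer params: {trainable}/{total}")
```

An optimizer built with `filter(lambda p: p.requires_grad, ...)` would then update only that single unfrozen layer.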
-
The ESM-Effect Architecture thus comprises the 35M ESM2 model with 10 of 12 layers frozen and the mutation position regression head (cf. Figure 2). The model’s performance is driven by two key inductive biases in the regression head:
How would you extend this combined head architecture (mutation position embedding + mean pooled) if you were looking at the effect of a multi-mutation variant?
One strategy I can think of would be to slice out all mutation positions and pool them. I'm wondering if you guys thought about generalizing the architecture to scenarios when the number of mutations in your DMS dataset varies.
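The "slice out all mutation positions and pool them" idea sketched above can be made concrete. The head below is an assumption-laden illustration (dimensions, MLP shape, and the name `CombinedHead` are hypothetical, not from the paper): it concatenates a pooled embedding over the mutated positions with the mean-pooled sequence embedding, so the same head handles one or several mutations per variant.

```python
import torch
import torch.nn as nn

class CombinedHead(nn.Module):
    """Sketch: per-mutation-position pooling + mean-pooled sequence embedding."""
    def __init__(self, embed_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, 1),
        )

    def forward(self, hidden, positions_mask):
        # hidden: (batch, seq_len, embed_dim) final-layer embeddings
        # positions_mask: (batch, seq_len) bool, True at mutated positions
        mask = positions_mask.unsqueeze(-1).float()
        # Average over all mutated positions -> works for 1..k mutations.
        mut_pooled = (hidden * mask).sum(1) / mask.sum(1).clamp(min=1.0)
        mean_pooled = hidden.mean(1)
        return self.mlp(torch.cat([mut_pooled, mean_pooled], dim=-1)).squeeze(-1)

hidden = torch.randn(2, 10, 64)
mask = torch.zeros(2, 10, dtype=torch.bool)
mask[0, 3] = True          # single-mutation variant
mask[1, [2, 7]] = True     # double-mutation variant, same head applies
scores = CombinedHead()(hidden, mask)
print(scores.shape)
```

Averaging keeps the pooled vector's scale independent of the mutation count; summing or attention-weighted pooling over the mutated positions would be natural alternatives when mutation counts vary widely across the DMS dataset.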
-
Figure 1:
Seeing the pre-training validation perplexity in (1) made me wonder: did you ever assess how fine-tuning affected post-training perplexity? This could be a proxy to gauge how disruptive fine-tuning is for the pre-training task.
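The post-fine-tuning perplexity probe suggested here is cheap to compute: evaluate the masked-language-modeling cross-entropy at masked positions before and after fine-tuning and exponentiate. A minimal sketch (the helper name and shapes are assumptions; real logits would come from the fine-tuned model):

```python
import math
import torch
import torch.nn.functional as F

def masked_lm_perplexity(logits, targets, mask):
    """Perplexity over masked positions: exp(mean cross-entropy).
    logits: (batch, seq_len, vocab); targets: (batch, seq_len); mask: bool."""
    ce = F.cross_entropy(logits[mask], targets[mask])
    return math.exp(ce.item())

# Sanity check: uniform logits over a 33-token vocabulary (ESM2's alphabet
# size) must give perplexity equal to the vocabulary size.
logits = torch.zeros(1, 8, 33)
targets = torch.randint(0, 33, (1, 8))
mask = torch.ones(1, 8, dtype=torch.bool)
ppl = masked_lm_perplexity(logits, targets, mask)
print(round(ppl, 2))  # 33.0
```

Comparing this quantity on held-out sequences before and after fine-tuning would quantify how disruptive the regression objective is to the pre-training task, complementing the unfrozen-layer ablation above.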
-