ProBASS—a language model with sequence and structural features for predicting the effect of mutations on binding affinity

Abstract

Motivation

Protein–protein interactions (PPIs) govern virtually all cellular processes, and a single mutation within a PPI can significantly impact protein functionality, potentially leading to diseases. While numerous approaches have emerged to predict changes in the free energy of binding due to mutations (ΔΔGbind), most lack precision. Recently, protein language models (PLMs) have shown powerful predictive capabilities by leveraging both sequence and structural data from protein complexes, yet they have not been optimized specifically for ΔΔGbind prediction.

Results

We developed ProBASS (Protein Binding Affinity from Structure and Sequence), an approach that predicts the effects of mutations on ΔΔGbind using two state-of-the-art PLMs: ESM2, which incorporates sequence features, and ESM-IF1, which incorporates structural features. We first generated embeddings for each PPI mutant from the two PLMs and then fine-tuned ProBASS on a large dataset of experimental ΔΔGbind values. When training and testing were done on the same PPI, ProBASS achieved correlations with experimental ΔΔGbind values of 0.83 ± 0.05 for single mutations and 0.69 ± 0.04 for double mutations. When evaluated on a dataset of 2,325 single mutations across 131 PPIs, ProBASS reached a correlation of 0.81 ± 0.02, substantially outperforming other PLMs in predictive accuracy. Our results demonstrate that refining pre-trained PLMs with extensive ΔΔGbind datasets across multiple PPIs is a successful approach for creating a precise and broadly applicable ΔΔGbind prediction model, facilitating future protein engineering and design studies. ProBASS’s accuracy could be further improved by retraining as more experimental data become available.
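To make the general idea concrete, the following is a minimal sketch of the workflow described above: combine a sequence-derived and a structure-derived embedding for each mutant, fit a regressor to ΔΔGbind values, and report the correlation on held-out mutants. Everything in it is an assumption made for illustration. The embeddings are random arrays standing in for real ESM2 and ESM-IF1 outputs, the ΔΔGbind labels are synthetic, the dimensions are arbitrary, the regressor is a plain ridge regression rather than the ProBASS architecture, and Pearson correlation is assumed because the abstract does not name the correlation measure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for per-mutant embeddings. In a real pipeline these
# would come from a sequence PLM and a structure-aware PLM; here they are
# random arrays with arbitrary, illustrative dimensions.
n_mutants, d_seq, d_struct = 200, 32, 16
seq_emb = rng.normal(size=(n_mutants, d_seq))
struct_emb = rng.normal(size=(n_mutants, d_struct))

# Combine both views of each mutant into one feature vector.
features = np.concatenate([seq_emb, struct_emb], axis=1)

# Synthetic ddG labels (NOT experimental data): a noisy linear function
# of the features, so that there is something learnable in this toy setup.
true_w = rng.normal(size=features.shape[1])
ddg = features @ true_w + rng.normal(scale=0.5, size=n_mutants)


def add_bias(X):
    """Append a constant column so the model can learn an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the intercept is left unpenalized."""
    X1 = add_bias(X)
    penalty = alpha * np.eye(X1.shape[1])
    penalty[-1, -1] = 0.0
    return np.linalg.solve(X1.T @ X1 + penalty, X1.T @ y)


def predict(X, w):
    return add_bias(X) @ w


def pearson(a, b):
    return float(np.corrcoef(a, b)[0, 1])


# Simple hold-out split: train on the first 150 mutants, test on the rest.
train, test = np.arange(0, 150), np.arange(150, n_mutants)
w = fit_ridge(features[train], ddg[train])
r = pearson(predict(features[test], w), ddg[test])
print(f"held-out Pearson r = {r:.2f}")
```

The correlation printed here reflects only the synthetic data and says nothing about ProBASS itself. Swapping the random arrays for real embeddings and the ridge model for a trained network would follow the same shape: featurize, fit, then score on mutants not seen during training.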

Availability and implementation

ProBASS is available at: https://colab.research.google.com/github/sagagugit/ProBASS/blob/main/ProBASS.ipynb.
