Reliable prediction of protein–protein binding affinity changes upon mutations with Pythia-PPI

Fangting Tao
Jinyuan Sun
Pengyue Gao
George Fu Gao
Bian Wu

Abstract

Protein–protein interactions (PPIs) are essential for numerous biological functions, and predicting binding affinity changes caused by mutations is crucial for understanding the impact of genetic variation and advancing protein engineering. Although machine-learning-based methods show promise in improving prediction accuracy, limited experimental data remain a significant bottleneck. In this study, we employed multitask learning and self-distillation to overcome the data limitation and improve the accuracy of protein–protein binding affinity prediction. By incorporating a mutation stability prediction task, our model achieved state-of-the-art accuracy on the SKEMPI dataset. It was subsequently used to predict binding affinity changes for millions of mutations, generating an expanded dataset for self-distillation. Compared with prevalent methods, Pythia-PPI increased the Pearson's correlation between predictions and experimental data from 0.6447 to 0.7850 on the SKEMPI dataset and from 0.3654 to 0.6050 on the viral-receptor dataset. Experimental validation further confirmed its ability to identify high-affinity mutations on the CB6 antibody in complex with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) prototype receptor binding domain, with the best single-point mutant among the top 10 predictions showing a 2-fold increase in binding affinity. These findings demonstrate that Pythia-PPI is a valuable tool for analysing the fitness landscape of PPIs. A web server for Pythia-PPI is available at https://pythiappi.wulab.xyz.
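
The abstract reports accuracy as the Pearson's correlation between predicted and experimentally measured binding affinity changes. As a minimal sketch of how such a score can be computed (this is not the authors' evaluation code; the function name `pearson_r` and the example values are hypothetical and invented purely for illustration):

```python
import math


def pearson_r(predicted, measured):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(measured):
        raise ValueError("inputs must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two paired values")

    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n

    # Covariance and variances (unnormalised; the 1/n factors cancel in r).
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)

    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical binding affinity changes (kcal/mol), NOT data from the study.
predicted_ddg = [0.5, 1.2, -0.3, 2.1, 0.0]
measured_ddg = [0.7, 1.0, -0.1, 1.8, 0.4]

r = pearson_r(predicted_ddg, measured_ddg)
print(f"Pearson r = {r:.4f}")
```

A value near 1 means the predictions rank and scale mutations much as the experiments do, which is the sense in which the reported increase from 0.6447 to 0.7850 on SKEMPI indicates better agreement.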

Version published to 10.1093/nsr/nwaf231
May 5, 2025
Version published to 10.1101/2024.10.28.620752 on bioRxiv
Nov 3, 2024
