Accurate prediction of protein stability changes from single mutations using self-distillation and antisymmetric constraint strategies

Wenkang Wang
Yihang Zhou
Xiaoqiang Huang
Yifan Wu
Min Li
Yang Zhang

Abstract

Computational approaches that accurately predict protein stability changes upon residue mutations are crucial for protein engineering and design. Sequence-based methods are easier to apply to proteins at large scale because they do not rely on high-quality structures. However, existing sequence-based approaches struggle to capture structural changes, resulting in lower performance than structure-based methods. In this study, we propose DPStab, a sequence-based deep learning method that accurately predicts protein stability changes upon single-residue mutations. DPStab uses a transferred protein large language model as its core component and incorporates a cross-attention mechanism to capture contact changes around mutated positions for ΔΔG and ΔT_m prediction. To address data imbalance and the antisymmetric nature of mutation effects, DPStab employs a self-distillation inference strategy under the supervision of an antisymmetric constraint. Benchmarking demonstrates that DPStab achieves state-of-the-art performance in both ΔΔG and ΔT_m prediction. Practical evaluations confirm DPStab's capability to accurately rank protein stability on large-scale datasets and to identify critical structural contacts that affect stability. Further experiments on extensive cDNA display proteolysis data demonstrate the significant contributions of the self-distillation and antisymmetric constraint strategies.
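The antisymmetric nature of mutation effects mentioned above means that the stability change of a forward mutation (wild type to mutant) should equal the negative of the reverse mutation (mutant back to wild type). The sketch below illustrates that general idea only; it is not DPStab's actual loss or inference code, which the abstract does not specify. The function names `antisymmetry_penalty` and `antisymmetrize` are hypothetical, and the inputs are assumed to be paired forward and reverse predictions in the same units.

```python
from typing import List, Sequence


def antisymmetry_penalty(forward: Sequence[float], reverse: Sequence[float]) -> float:
    """Mean squared violation of forward == -reverse over paired predictions.

    Hypothetical illustration: returns 0.0 when every pair is perfectly
    antisymmetric, and grows as forward + reverse moves away from zero.
    """
    if len(forward) != len(reverse):
        raise ValueError("forward and reverse must have the same length")
    if not forward:
        return 0.0
    return sum((f + r) ** 2 for f, r in zip(forward, reverse)) / len(forward)


def antisymmetrize(forward: Sequence[float], reverse: Sequence[float]) -> List[float]:
    """Combine each pair as (forward - reverse) / 2.

    Swapping the two inputs flips the sign of every output, so the combined
    forward-direction estimate is antisymmetric by construction.
    """
    if len(forward) != len(reverse):
        raise ValueError("forward and reverse must have the same length")
    return [(f - r) / 2 for f, r in zip(forward, reverse)]
```

A penalty of this kind could be added to a regression loss during training, while the averaging step is one simple way to enforce the property at inference time; both are assumptions about how such a constraint might be applied, not a description of the published method.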

Significance Statement

Single amino acid mutations significantly influence protein stability, thereby affecting biological function and potential therapeutic uses. Accurately predicting how mutations affect protein stability is fundamental to protein engineering and therapeutic design. However, current sequence-based computational methods fail to capture changes in the structural context around mutated residues. To overcome this, we propose DPStab, a sequence-based deep learning approach that combines a protein language model and a cross-attention mechanism with self-distillation and antisymmetric strategies. DPStab effectively captures residue contact changes and predicts stability changes without structural data. Extensive experiments demonstrate that DPStab significantly outperforms existing methods, providing a fast and practical tool for enhancing protein engineering and biomedical research.

Version published to 10.1101/2025.05.18.654422 on bioRxiv
May 23, 2025

Related articles

Quantum-Assisted Refinement of AlphaFold Protein Structures
Author: Parham Ghayour
Latest version: Dec 31, 2025

The Evolution of the AlphaFold Architecture
Author: Y.C.B.J. Dissanayaka
Latest version: Jan 9, 2026

A Survey on Efficient Protein Language Models
Authors: Shouren Wang, Debargha Ganguly, Vinooth Kulkarni, Wang Yang, Zhuoran Qiao, Daniel Blankenberg, Vipin Chaudhary, Xiaotian Han
Latest version: Dec 24, 2025
