ProStab: Prediction of protein stability change upon mutations by protein language and inverse folding models

Hong Tan
Xiaowei Wei
Shenggeng Lin
Xueying Mao
Junwei Chen
Heqi Sun
Yufang Zhang
Zhenghong Zhou
Dong-Qing Wei
Shuangjun Lin
Yi Xiong

Abstract

Predicting protein stability change upon mutation is critical for protein engineering, yet it remains limited by the modeling assumptions of physics-based methods and the generalization bottlenecks of data-driven approaches. We present ProStab, a deep learning framework that integrates sequence- and structure-based information, including mutation-aware sequence embeddings from protein language models and geometric features extracted via an inverse folding model. Trained on the large-scale Megascale dataset, ProStab demonstrates strong performance across diverse test sets and robust generalization under distribution shifts between the training and test sets. In head-to-head comparisons, ProStab outperforms all state-of-the-art methods, with consistently higher Spearman correlation and precision. To evaluate its practical utility, we experimentally validated ProStab-predicted mutations on the model enzyme transaminase. Among the 16 successfully expressed variants, 4 exhibited improved thermal stability. Remarkably, the top-ranked predicted mutation yielded the highest observed enzymatic activity, retaining three times that of the wild type after 10 minutes at 40 °C. To facilitate broader application, a publicly accessible web server has been developed. We envisage that ProStab provides a scalable and accurate platform for intelligent protein stability design.

Version published to 10.1101/2025.08.11.669595 on bioRxiv
Aug 15, 2025
