In-Silico Stability Predictors: Investigation of Performance Towards Balanced Experimental Data

Kristine Degn
Mattia Utichi
Pablo Sánchez-Izquierdo Besora
Matteo Tiberti
Elena Papaleo

Abstract

Changes in protein stability, quantified as the change in Gibbs free energy of folding (ΔΔG, in kcal/mol), play a crucial role in altering protein function, and misfolding and destabilization are commonly associated with pathogenicity. Over the past two decades, bioinformatics tools have been developed that leverage evolutionary knowledge, empirical force fields, and machine learning to predict stability changes. However, existing tools are often optimized towards or trained on limited experimental data, leading to unbalanced datasets and potential overfitting. Our objective is to benchmark selected stability predictors on an unbiased and balanced dataset, using AlphaFold structures as input. We demonstrate that performance declines when the data are balanced across amino acids, stabilizing and destabilizing mutations, and protein representatives, highlighting that addressing redundancy alone is insufficient to correct a benchmark. Additionally, we show that a protein structure ensemble from molecular dynamics is a superior input compared to a single static structure, whereas coarse-grained methodologies tend to decrease output quality.
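Two ideas in the abstract, balancing stabilizing against destabilizing mutations and averaging predictions over a structural ensemble, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy records, and the sign convention (negative ΔΔG taken as stabilizing) are all assumptions made for illustration, and the sketch ignores the balancing across amino acids and protein representatives that the abstract also describes.

```python
import random
from statistics import mean


def balance_by_sign(records, seed=0):
    """Down-sample so both mutation classes have the same number of entries.

    records: list of (label, ddg) tuples with ddg in kcal/mol.
    Assumption for this sketch: ddg < 0 is stabilizing, ddg > 0 is
    destabilizing; entries with ddg == 0 are dropped.
    """
    stabilizing = [r for r in records if r[1] < 0]
    destabilizing = [r for r in records if r[1] > 0]
    n = min(len(stabilizing), len(destabilizing))
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return rng.sample(stabilizing, n) + rng.sample(destabilizing, n)


def ensemble_ddg(per_frame_ddg):
    """Average the per-structure predictions for one mutation over an ensemble."""
    return mean(per_frame_ddg)


# Hypothetical toy records (made-up labels and values, not experimental data).
toy = [("mutA", 1.2), ("mutB", 0.8), ("mutC", 2.5), ("mutD", -0.6)]
balanced = balance_by_sign(toy)
averaged = ensemble_ddg([1.0, 2.0, 3.0])
```

Down-sampling the majority class is only one of several possible balancing strategies; it is used here because it leaves the retained measurements unchanged.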

Version published to 10.1101/2025.03.28.645695 on bioRxiv
Apr 2, 2025
