StrucNS reveals interaction-weighted network topology as the driving predictor of absolute stability of natural and de novo proteins
Abstract
Motivation
Folded protein function requires stability, yet mapping structure and sequence to a fitness landscape remains difficult. The protein fold is the physical realization of complex, spatially sensitive physicochemical interactions among residues; quantitatively elucidating how these subtle relationships dictate thermodynamic stability remains challenging. We present StrucNS, a mathematical framework that identifies principles governing protein fitness by employing network science to learn physicochemical relationships between residues and their stability contributions directly from the protein fold. Representing the fold as a network topology, we take an inverse approach: while the fold is traditionally viewed as the phenotypic consequence of the underlying chemical forces, we use the topology to decode the physicochemical dependencies that govern protein stability. Unlike protein language models, which rely on high-dimensional evolutionary embeddings, StrucNS extracts these signals directly from the interaction-weighted network topology. This independence from evolutionary history makes StrucNS uniquely suited to predicting the stability of de novo designs.
Results
Despite reduced dimensionality and training depth, StrucNS outperforms ESM-2 and ProteinMPNN in predicting mutational stability, and it outperforms supervised UniRep in predicting the absolute stability of de novo designs. Feature analysis reveals network topology as the key driver of predictive power, contributing 59% of model importance. SHAP analysis identifies two highly influential features: high degree and low modularity of polar/hydrophobic mixed subnetworks. This highlights the importance of connectivity between the hydrophobic core and the protein surface in driving stability, in contrast to the conventional focus on the hydrophobic core. The identification of these predictive topological features underscores the utility of an interpretable model.
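To make the two graph quantities named above concrete, the following is a minimal sketch of weighted degree and Newman modularity on a toy residue contact network. It is not the StrucNS implementation: the coordinates, the 8 Å cutoff, the linear distance-based edge weight, the partition by residue class, and all function names are assumptions chosen only for illustration, and the abstract does not specify how StrucNS defines its interaction weights or mixed subnetworks.

```python
import math
from itertools import combinations

# Toy residues: name -> (C-alpha coordinate in angstroms, residue class).
# All values are invented for illustration.
residues = {
    "L1": ((0.0, 0.0, 0.0), "hydrophobic"),
    "V2": ((4.0, 0.0, 0.0), "hydrophobic"),
    "I3": ((0.0, 4.0, 0.0), "hydrophobic"),
    "S4": ((6.0, 6.0, 0.0), "polar"),
    "K5": ((10.0, 6.0, 0.0), "polar"),
    "E6": ((6.0, 10.0, 0.0), "polar"),
}


def build_contact_network(residues, cutoff=8.0):
    """Return {(a, b): weight} for residue pairs closer than `cutoff`.

    The weight 1 - d/cutoff is a hypothetical stand-in for an
    interaction weight; closer pairs get larger weights.
    """
    edges = {}
    for (a, (pa, _)), (b, (pb, _)) in combinations(residues.items(), 2):
        d = math.dist(pa, pb)
        if d < cutoff:
            edges[(a, b)] = 1.0 - d / cutoff
    return edges


def weighted_degree(edges):
    """Sum of incident edge weights for every node that has an edge."""
    deg = {}
    for (a, b), w in edges.items():
        deg[a] = deg.get(a, 0.0) + w
        deg[b] = deg.get(b, 0.0) + w
    return deg


def modularity(edges, community):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / (2 m))^2 ].

    m   : total edge weight
    L_c : total weight of edges with both ends in community c
    d_c : summed weighted degree of the nodes in community c
    """
    m = sum(edges.values())
    if m == 0:
        return 0.0
    intra, total = {}, {}
    for (a, b), w in edges.items():
        if community[a] == community[b]:
            intra[community[a]] = intra.get(community[a], 0.0) + w
    for node, d in weighted_degree(edges).items():
        total[community[node]] = total.get(community[node], 0.0) + d
    return sum(intra.get(c, 0.0) / m - (total[c] / (2 * m)) ** 2 for c in total)


edges = build_contact_network(residues)
classes = {name: cls for name, (_, cls) in residues.items()}
print("weighted degree:", weighted_degree(edges))
print("modularity of polar/hydrophobic partition:", modularity(edges, classes))
```

In this toy setting, a lower modularity for the polar/hydrophobic partition means that more of the total edge weight crosses between the two classes, which is one way to read the connectivity between core and surface described above.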
Availability
Source code is available at https://github.com/Hackel-Group-CEMS/StrucNS