StoPred: Accurate Stoichiometry Prediction for Protein Complexes Using Protein Language Models and Graph Attention
Abstract
Proteins often function as part of complexes, and the specific stoichiometry of these assemblies is critical for their biological roles, yet experimental determination of assembly composition remains challenging and existing computational methods for stoichiometry prediction are limited. These approaches rely on template-based searches or require a predefined stoichiometry for structure prediction, hampering their applicability to proteins without close homologs or known assembly states. Recent advances using protein language models (pLMs) have enabled sequence-based prediction of homo-oligomer stoichiometry, but these methods are not applicable to hetero-oligomeric complexes and do not fully leverage inter-subunit relationships. Here, we present StoPred, a method that predicts the stoichiometry of protein complexes by integrating pLM embeddings with a graph attention network to model subunit-level interactions. StoPred infers stoichiometry directly from sequence or structure features for both homo- and hetero-oligomers, without requiring template assemblies or a predefined composition. We benchmark StoPred against deep learning-based and template-based methods and show that it achieves improved accuracy and efficiency across curated and blind datasets, with up to 16% and 41% higher top-1 accuracy for homomeric and heteromeric complexes, respectively, compared with the strongest prior method on our held-out test dataset. More importantly, StoPred is the first deep learning-based method capable of accurately predicting the stoichiometry of hetero-oligomeric complexes.
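To make the architecture described above concrete, the sketch below shows one way per-subunit pLM embeddings could be passed through graph attention layers over a subunit graph to predict per-subunit copy numbers. This is a minimal illustration assuming PyTorch and PyTorch Geometric; the class name, dimensions, two-layer depth, and classification head over copy numbers are hypothetical and are not taken from the StoPred implementation.

```python
# Hypothetical sketch: pLM embeddings per subunit -> graph attention over the
# subunit graph -> copy-number logits per subunit. Not the authors' code.
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv


class StoichiometryGATSketch(nn.Module):
    def __init__(self, embed_dim: int = 1280, hidden_dim: int = 256,
                 heads: int = 4, max_copies: int = 12):
        super().__init__()
        # Project pooled pLM embeddings (e.g., mean-pooled per-sequence vectors)
        self.proj = nn.Linear(embed_dim, hidden_dim)
        # Graph attention layers over the subunit graph (e.g., edges between
        # distinct chains of a candidate complex)
        self.gat1 = GATConv(hidden_dim, hidden_dim, heads=heads, concat=False)
        self.gat2 = GATConv(hidden_dim, hidden_dim, heads=heads, concat=False)
        # Per-subunit classifier over candidate copy numbers 1..max_copies
        self.head = nn.Linear(hidden_dim, max_copies)

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # x:          [num_subunits, embed_dim] pooled pLM embeddings
        # edge_index: [2, num_edges] subunit-subunit connectivity
        h = torch.relu(self.proj(x))
        h = torch.relu(self.gat1(h, edge_index))
        h = torch.relu(self.gat2(h, edge_index))
        return self.head(h)  # copy-number logits per subunit


if __name__ == "__main__":
    # Toy usage: a heteromer with two unique chains connected bidirectionally
    model = StoichiometryGATSketch()
    x = torch.randn(2, 1280)                     # one pLM embedding per unique chain
    edge_index = torch.tensor([[0, 1], [1, 0]])  # edges A->B and B->A
    logits = model(x, edge_index)
    print(logits.argmax(dim=-1) + 1)             # predicted copy number per chain
```

In this reading, a homomer reduces to a single-node graph, while a heteromer's stoichiometry is obtained by combining the per-subunit copy-number predictions, which is consistent with the abstract's claim that the same model handles both cases.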