Macon: Enhance Protein Mutation Representation using Contrastive Learning with Effect Prediction on Protein–protein Interactions
Abstract
Mutations in protein sequences can significantly alter protein-protein interactions (PPIs), leading to diverse functional outcomes relevant to disease mechanisms and therapeutic targeting. While existing computational approaches predominantly estimate changes in binding free energy in PPIs, they often fail to capture categorical effects such as complete disruption of an interaction or gain of an interaction. Categorical models such as MIPPI address this by classifying mutation effects into functional classes, yet their reliance on one-hot encoding limits their ability to capture detailed sequence information. Here, we propose Macon, a two-stage deep-learning framework that integrates contrastive pretraining and protein language model (pLM) embeddings to enhance mutation-sensitive sequence representation. In the first stage, Macon leverages contrastive learning to distinguish wild-type and mutant sequences in a context-independent manner; in the second, it integrates both contrastive embeddings and pLM-derived features to perform multi-class classification of PPI mutation effects. Evaluated on a curated IMEx dataset of 10,119 annotated single-point mutations, Macon achieves state-of-the-art performance with an overall accuracy of 0.73, outperforming baseline methods including MIPPI and embedding-only classifiers. Our results highlight the benefit of contrastive representation learning in capturing subtle mutational impacts and demonstrate Macon’s utility as a robust and generalizable tool for functional variant interpretation in protein interaction networks.
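To make the first-stage idea concrete, the following is a minimal sketch of a contrastive objective over sequence embeddings. The abstract does not specify Macon's loss function, encoder, or pairing scheme, so everything here is an assumption: the sketch uses a generic margin-based pairwise contrastive loss, and the names `contrastive_loss`, `margin`, and the toy two-dimensional embeddings are hypothetical illustrations rather than the authors' implementation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same, margin=1.0):
    """Generic margin-based pairwise contrastive loss (assumed, not Macon's).

    same=True  -> the pair should be close: loss is d^2.
    same=False -> the pair should be at least `margin` apart:
                  loss is max(0, margin - d)^2.
    """
    d = euclidean(u, v)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy embeddings (hypothetical values, chosen only for illustration).
wild_type = [0.0, 0.0]
wild_type_view = [0.0, 0.0]   # a second embedding of the same sequence
mutant_near = [0.3, 0.4]      # distance 0.5 from wild type, inside the margin
mutant_far = [3.0, 4.0]       # distance 5.0 from wild type, outside the margin

# A mutant embedded too close to the wild type is penalised ...
loss_near = contrastive_loss(wild_type, mutant_near, same=False)
# ... while one already separated by more than the margin is not.
loss_far = contrastive_loss(wild_type, mutant_far, same=False)
# Two embeddings of the same sequence incur no loss when they coincide.
loss_same = contrastive_loss(wild_type, wild_type_view, same=True)

print(loss_near, loss_far, loss_same)
```

Under this assumed objective, minimising the loss pushes mutant embeddings away from their wild-type counterparts until they are separated by at least `margin`, which is one common way to obtain mutation-sensitive representations before a downstream classifier is trained.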