Macon: Enhance Protein Mutation Representation using Contrastive Learning with Effect Prediction on Protein–protein Interactions

Weihao Li
Zhe Liu
Guan Ning Lin

Abstract

Mutations in protein sequences can significantly alter protein-protein interactions (PPIs), leading to diverse functional outcomes relevant to disease mechanisms and therapeutic targeting. While existing computational approaches predominantly estimate changes in PPI binding free energy, they often fail to capture categorical effects such as complete disruption or gain of an interaction. Categorical models such as MIPPI address this by classifying mutation effects into functional classes, yet their reliance on one-hot encoding limits their ability to capture detailed sequence information. Here, we propose Macon, a two-stage deep-learning framework that integrates contrastive pretraining and protein language model (pLM) embeddings to enhance mutation-sensitive sequence representation. In the first stage, Macon leverages contrastive learning to distinguish wild-type and mutant sequences in a context-independent manner; in the second, it combines the contrastive embeddings with pLM-derived features to perform multi-class classification of PPI mutation effects. Evaluated on a curated IMEx dataset of 10,119 annotated single-point mutations, Macon achieves state-of-the-art performance with an overall accuracy of 0.73, outperforming baseline methods including MIPPI and embedding-only classifiers. Our results highlight the benefit of contrastive representation learning in capturing subtle mutational impacts and demonstrate Macon’s utility as a robust and generalizable tool for functional variant interpretation in protein interaction networks.
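
As a rough illustration of the two-stage idea described in the abstract, the sketch below pairs a margin-based contrastive loss with a linear softmax classifier over concatenated features. It is not the authors' implementation. The abstract does not specify Macon's loss function, network architecture, or label set, so the margin loss, the function names (`contrastive_pair_loss`, `classify_effect`), the toy embeddings, and the `EFFECT_CLASSES` list are all assumptions made for this example.

```python
import math

# Hypothetical label set for illustration; not taken from the paper.
EFFECT_CLASSES = ["disrupting", "no_effect", "gain"]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_pair_loss(emb_a, emb_b, same, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    Assumed pairing scheme: two views of the same sequence are 'same'
    (pulled together); a wild-type/mutant pair is 'not same' (pushed
    apart until the distance reaches the margin).
    """
    d = euclidean(emb_a, emb_b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_effect(contrastive_emb, plm_emb, weights, biases):
    """Stage-2 sketch: concatenate both feature vectors, then apply
    one linear layer and a softmax to get class probabilities."""
    features = list(contrastive_emb) + list(plm_emb)
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


# Toy usage with made-up 2-d embeddings.
wild_type = [0.0, 0.0]
mutant = [0.3, 0.4]  # distance 0.5 from the wild type
stage1_loss = contrastive_pair_loss(wild_type, mutant, same=False)

# Untrained (all-zero) weights: one row per class, one column per feature.
weights = [[0.0] * 4 for _ in EFFECT_CLASSES]
biases = [0.0] * len(EFFECT_CLASSES)
probs = classify_effect(mutant, [0.1, -0.2], weights, biases)
```

In this toy setup the wild-type/mutant pair sits inside the margin, so the loss is positive and a trained encoder would be pushed to separate the two. The classifier returns a uniform distribution because its weights are all zero.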

Version published to 10.21203/rs.3.rs-7469880/v1 on Research Square
Oct 9, 2025

Related articles

Protein Language Models Rescue Variant Pathogenicity Prediction in Intrinsically Disordered Regions Through Synergistic Integration with Structure-Based Methods

This article has 1 author:
1. Hayden Farquhar
This article has no evaluations
Latest version Feb 4, 2026
A Survey on Efficient Protein Language Models

This article has 8 authors:
1. Shouren Wang
2. Debargha Ganguly
3. Vinooth Kulkarni
4. Wang Yang
5. Zhuoran Qiao
6. Daniel Blankenberg
7. Vipin Chaudhary
8. Xiaotian Han
This article has no evaluations
Latest version Dec 24, 2025
Integrating Evolutionary and Compositional Features with ML and DL for Robust and Interpretable Druggable Protein Prediction

This article has 5 authors:
1. Mujeebu Rehman
2. Qinghua Liu
3. Muhammad Javed
4. Ali Ghulam
5. Teerath Kumar
This article has no evaluations
Latest version Dec 11, 2025
