Reliable Identification of Homodimers Using AlphaFold


Abstract

Motivation: Protein-protein interactions are central to understanding biological processes. The ability to predict interaction partners is extremely valuable for avoiding costly, time-consuming experiments. It has been shown that AlphaFold has an unsurpassed ability to accurately evaluate interacting protein pairs. However, a protein can also form homomeric interactions, i.e., interact with itself.

Results: We found that AlphaFold yielded a significantly higher false-positive rate when identifying homodimers than when identifying heterodimers. The true positive rate (TPR) at a 1% false positive rate (FPR) drops from 63% for heterodimers to 18% for homodimers. When we investigated the high-scoring false positives, i.e., non-homodimers that receive high AlphaFold scores when modeled as homodimers, we found that their homologs were enriched for homomultimeric proteins. Using a simple logistic regression model that combines AlphaFold scores with structural and homology information, we increased the TPR (at 1% FPR) from 19% to 42 ± 8% (5-fold cross-validation). If we excluded the homology information, we achieved a TPR of 28 ± 7%, which is still better than using AlphaFold metrics alone.

Availability and implementation: All data are available from Zenodo (DOI: 10.5281/zenodo.17738668), and all code is available from https://github.com/SarahND97/alphafold-homodimers
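The headline metric in the abstract is the TPR at a fixed 1% FPR. The abstract does not say how the authors compute it, so the following is only a minimal sketch of one common convention: pick the loosest score cutoff at which at most 1% of the true negatives are called positive, then report the fraction of true positives above that cutoff. The function name `tpr_at_fpr` and the toy scores are hypothetical and are not taken from the paper's code or data.

```python
import math


def tpr_at_fpr(scores, labels, max_fpr=0.01):
    """Return (tpr, threshold) for the loosest cutoff with FPR <= max_fpr.

    scores: one confidence value per candidate (higher = more confident).
    labels: 1 for a true dimer, 0 for a non-dimer.
    A candidate is called positive when its score is strictly greater
    than the returned threshold.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")

    # Number of negatives we may misclassify without exceeding max_fpr.
    allowed_fp = math.floor(max_fpr * len(neg))

    # Cutting strictly above the (allowed_fp + 1)-th highest negative score
    # lets at most allowed_fp negatives through, even when scores are tied.
    threshold = neg[allowed_fp] if allowed_fp < len(neg) else -math.inf

    tpr = sum(s > threshold for s in pos) / len(pos)
    return tpr, threshold


# Toy data (made up for illustration): 100 negatives and 4 positives.
toy_scores = [i / 100 for i in range(100)] + [0.995, 0.985, 0.5, 0.2]
toy_labels = [0] * 100 + [1] * 4
toy_tpr, toy_threshold = tpr_at_fpr(toy_scores, toy_labels, max_fpr=0.01)
print(f"TPR at 1% FPR: {toy_tpr:.2f} (threshold {toy_threshold:.2f})")
```

Using a strict inequality at the cutoff keeps the FPR bound valid when several negatives share the same score, at the cost of being slightly conservative. With few negatives the 1% budget may allow zero false positives, in which case the cutoff is simply the highest negative score.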
