Evaluating zero-shot prediction of protein design success by AlphaFold, ESMFold, and ProteinMPNN

Abstract

De novo protein design has enabled the creation of proteins with diverse functionalities that are not found in nature. Despite recent advances, experimental success rates remain inconsistent and context-dependent, posing a bottleneck for broader applications of de novo design. To address this, structure and sequence prediction models have been applied to assess design quality before experimental testing, saving time and resources. In this study, we aimed to determine the extent to which AlphaFold2, ProteinMPNN, and ESMFold can discriminate between experimentally successful and unsuccessful designs. For this, we curated a benchmark dataset of 614 experimentally characterized de novo designed monomers from 11 different design studies published between 2012 and 2021. All predictive models demonstrated only a moderate ability to discriminate experimental successes (designs that are expressed, soluble, monomeric, and folded into the intended structure) from failures, with many failed designs having better confidence metrics than successful designs. Among all computational models evaluated, ESMFold average pLDDT yielded the best individual performance at distinguishing between successful and unsuccessful designs. A logistic regression model combining all confidence metrics provided only modest improvement over ESMFold pLDDT alone. Overall, these results show that these models can serve as an initial filtering strategy prior to experimental validation; however, their utility for accurately predicting experimentally successful designs remains limited without task-specific training.
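As an illustration of what discriminating successes from failures with a single confidence metric can mean in practice, the following minimal sketch computes a rank-based ROC AUC for one score. The abstract does not state which evaluation metric the study used, so the choice of AUC is an assumption, and the function name `roc_auc` and the toy values are hypothetical rather than taken from the benchmark.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a randomly chosen success outscores a randomly
    chosen failure, counting ties as half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one success and one failure")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy values, NOT data from the study:
# an average pLDDT per design and a binary experimental outcome.
plddt = [91.0, 85.0, 78.0, 88.0, 70.0, 82.0]
success = [1, 1, 0, 0, 0, 1]

auc = roc_auc(plddt, success)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. In the toy data, one failed design scores higher than two of the successful ones, which mirrors the observation that failed designs can have better confidence metrics than successful designs.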
