Mapping the diverse topologies of protein-protein interaction fitness landscapes

Abstract

De novo binder discovery is unpredictable and inefficient because protein-protein interaction (PPI) sequence-function landscapes are not understood quantitatively. Here, we use our PANCS-Binder technology to perform >1,300 independent selections, across libraries of a randomized small protein that vary in size and composition, to identify binders to a panel of 96 distinct target proteins. For successful selections, we discovered reproducible fitness landscapes that group into a few target-specific clusters. Each cluster defines a minimal binding motif whose frequency is inversely proportional to the number of specified amino acids (∼2–8). This frequency determines selection success, which can be quantified as the density of binders to the target within a theoretical sequence space. We leverage these data to develop a supervised contrastive learning approach that discriminates binders from non-binders and generalizes once the training data exceed a threshold amount. Together, this framework renders PPI landscapes measurable and predictive, accelerating de novo binder discovery and optimization.
