Deep Evolutionary Fitness Inference for Variant Nomination from Directed Evolution
Abstract
Iterative screening techniques, such as directed evolution, enable high-throughput affinity maturation to optimize binders to molecular interfaces. However, deciding which variants from rich, evolved populations should enter low-throughput follow-up methods remains a significant bottleneck. Here, we present evolutionary fitness inference (EVFI) and DeepEVFI, two machine learning methods that model directed evolution from time-series sequencing data and infer fitness, defined as a variant’s ability to enrich under selection pressure. Our methods flexibly handle mutation mechanisms and starting populations that may be partially unknown – settings relevant to drug discovery – and achieve strong performance on a diverse set of experimental data. We conducted two experimental directed evolution campaigns, using antibody and macrocyclic peptide libraries, to identify and optimize binders to therapeutically relevant targets. EVFI and DeepEVFI identified tighter binders that were missed by human experts using conventional frequency-based approaches, including “rising stars” with low frequency. Beyond initial hit discovery, EVFI and DeepEVFI enable labeling of large-scale sequence-fitness datasets and identification of variants of initial binders with diverse properties.
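
The contrast between ranking by frequency and ranking by enrichment can be illustrated with a minimal sketch. This is not EVFI or DeepEVFI: it is a naive log-frequency slope, swapped in purely as a stand-in for "ability to enrich", and it ignores mutation and unknown starting populations. The function name `log_enrichment_slopes`, the pseudocount, and the read counts are all hypothetical toy values, not data from the study.

```python
import numpy as np

def log_enrichment_slopes(counts, pseudocount=0.5):
    """Least-squares slope of log-frequency versus round, per variant.

    counts: array of shape (rounds, variants) holding read counts.
    A crude enrichment proxy only; NOT the EVFI/DeepEVFI model.
    """
    counts = np.asarray(counts, dtype=float) + pseudocount  # avoid log(0)
    freqs = counts / counts.sum(axis=1, keepdims=True)      # per-round frequencies
    log_freqs = np.log(freqs)
    rounds = np.arange(counts.shape[0], dtype=float)
    centered = rounds - rounds.mean()
    # Ordinary least-squares slope for every variant at once.
    return centered @ log_freqs / (centered @ centered)

# Invented toy counts: rows are selection rounds, columns are variants A, B, C.
variants = ["A", "B", "C"]
counts = np.array([
    [9000, 990, 10],
    [9200, 760, 40],
    [9300, 540, 160],
])

slopes = log_enrichment_slopes(counts)
final_freq_rank = [variants[i] for i in np.argsort(-counts[-1])]
slope_rank = [variants[i] for i in np.argsort(-slopes)]

print("ranked by final frequency:", final_freq_rank)
print("ranked by enrichment slope:", slope_rank)
```

In this toy example, variant C is the rarest in every round yet enriches fastest, so a frequency-based ranking places it last while the slope-based ranking places it first. This is the kind of low-frequency "rising star" the abstract describes.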