Atom level enzyme active site scaffolding using RFdiffusion2
Abstract
De novo enzyme design starts from ideal active site descriptions, consisting of constellations of catalytic residue functional groups around reaction transition state(s), and seeks to generate protein structures that can accurately hold the site in place. Highly active enzymes have been designed starting from such descriptions using the generative AI method RFdiffusion [1–3], but the approach currently has two methodological limitations. First, the geometry of the active site can only be specified at the residue level, so for each catalytic residue functional group placed around the reaction transition state, the possible locations of the residue backbone must be enumerated by building side chain rotamers back from the functional group. Second, the location of the catalytic residues along the sequence must be specified in advance, which considerably limits the space of solutions that can be sampled. Here we describe a new deep generative method, RoseTTAFold diffusion 2 (RFdiffusion2), that solves both problems, enabling enzymes to be designed from sequence-agnostic descriptions of functional group locations without inverse rotamer generation. We first evaluate RFdiffusion2 on an in silico enzyme design benchmark of 41 diverse active sites and find that it successfully builds proteins scaffolding all 41 sites, compared to 16 of 41 with prior state-of-the-art deep learning methods. Next, we design enzymes around three diverse catalytic sites and characterize the designs experimentally; in each case we identify active catalysts by testing fewer than 96 sequences. RFdiffusion2 demonstrates the potential of atomic-resolution generative models for the design of de novo enzymes directly from their reaction mechanisms.
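To give a rough sense of why the two limitations above constrain sampling, the following counting sketch may help. It is an illustration only, not part of RFdiffusion or RFdiffusion2: the function name `search_space_size`, the rotamer counts, and the scaffold length are all hypothetical, and it assumes that each catalytic residue independently takes one of a fixed number of rotamers and must be assigned a distinct sequence index.

```python
from math import perm, prod


def search_space_size(rotamers_per_residue, scaffold_length):
    """Count discrete (rotamer choice, sequence index) combinations.

    rotamers_per_residue: hypothetical number of side chain rotamers
        considered for each catalytic residue (an assumption, not a
        value from the paper).
    scaffold_length: hypothetical length of the designed protein.
    """
    n_catalytic = len(rotamers_per_residue)
    # Each residue's backbone position follows from its rotamer choice,
    # so backbone placements multiply across catalytic residues.
    backbone_placements = prod(rotamers_per_residue)
    # Each catalytic residue needs its own sequence index; order matters
    # because the residues are not interchangeable.
    index_assignments = perm(scaffold_length, n_catalytic)
    return backbone_placements * index_assignments


# Illustrative numbers only: three catalytic residues with ten rotamers
# each, placed in a 100-residue scaffold.
print(search_space_size([10, 10, 10], 100))
```

Under these assumed numbers the count is already in the hundreds of millions of discrete specifications, each of which would need to be fixed before generation in the residue-level, index-specified setting described above.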