SFCalculator: connecting deep generative models and crystallography
Abstract
Proteins drive biochemical transformations by transitioning through distinct conformational states. Understanding these states is essential for modulating protein function. Although X-ray crystallography has enabled revolutionary advances in protein structure prediction by machine learning, this connection was made at the level of atomic models, not the underlying data. This lack of connection to crystallographic data limits further advances in both the accuracy of protein structure prediction and the application of machine learning to experimental structure determination. Here, we present SFCalculator, a differentiable pipeline that generates crystallographic observables from atomistic molecular structures with bulk solvent correction, bridging crystallographic data and neural network-based molecular modeling. We validate SFCalculator against conventional methods and demonstrate its utility through three proof-of-concept applications. First, SFCalculator enables accurate placement of molecular models relative to crystal lattices (known as phasing). Second, SFCalculator enables searching the latent space of generative models for conformations that fit crystallographic data and are therefore also implicitly constrained by the information encoded by the model. Finally, SFCalculator allows crystallographic data to be used during the training of generative models, so that these models can generate an ensemble of conformations consistent with the data. SFCalculator therefore enables a new generation of analytical paradigms integrating crystallographic data and machine learning.
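To make "differentiable pipeline" concrete, the following is a minimal sketch of the underlying idea, not SFCalculator's actual interface. It assumes a P1 cell with unit point scatterers and omits atomic form factors, B-factors, symmetry, and the bulk solvent correction. It also uses a hand-derived gradient where a real pipeline would rely on automatic differentiation. The function names `structure_factor`, `amplitude_loss`, and `amplitude_loss_grad` are hypothetical and chosen only for illustration.

```python
import cmath
import math


def structure_factor(hkl, frac_coords):
    """Direct summation F(h) = sum_j exp(2*pi*i * h . x_j).

    Assumes unit scattering factors and fractional coordinates x_j
    (illustrative simplification; no form factors or B-factors).
    """
    total = 0j
    for x in frac_coords:
        phase = 2.0 * math.pi * sum(h * xi for h, xi in zip(hkl, x))
        total += cmath.exp(1j * phase)
    return total


def amplitude_loss(frac_coords, reflections, observed):
    """Sum of squared differences between calculated and observed amplitudes."""
    return sum(
        (abs(structure_factor(hkl, frac_coords)) - f_obs) ** 2
        for hkl, f_obs in zip(reflections, observed)
    )


def amplitude_loss_grad(frac_coords, reflections, observed):
    """Analytic gradient of amplitude_loss with respect to each coordinate.

    Uses d|F|/dx = Re(conj(F) * dF/dx) / |F| and
    dF/dx_jk = 2*pi*i * h_k * exp(2*pi*i * h . x_j).
    """
    grad = [[0.0, 0.0, 0.0] for _ in frac_coords]
    for hkl, f_obs in zip(reflections, observed):
        f_calc = structure_factor(hkl, frac_coords)
        amp = abs(f_calc)
        if amp == 0.0:
            continue  # |F| is not differentiable at zero; skip this term
        for j, x in enumerate(frac_coords):
            phase = 2.0 * math.pi * sum(h * xi for h, xi in zip(hkl, x))
            # Derivative of |F| per unit of h_k for atom j
            d_amp = 2.0 * math.pi * (
                f_calc.conjugate() * 1j * cmath.exp(1j * phase)
            ).real / amp
            for k in range(3):
                grad[j][k] += 2.0 * (amp - f_obs) * d_amp * hkl[k]
    return grad


# Toy usage: amplitudes simulated from a "true" model, gradient at a trial model
true_xyz = [[0.10, 0.20, 0.30], [0.40, 0.15, 0.70]]
reflections = [(1, 0, 0), (0, 1, 0), (1, 1, 1), (2, -1, 3)]
observed = [abs(structure_factor(hkl, true_xyz)) for hkl in reflections]
trial_xyz = [[0.12, 0.18, 0.33], [0.41, 0.10, 0.68]]
gradient = amplitude_loss_grad(trial_xyz, reflections, observed)
```

Because the loss is a smooth function of the coordinates, the same gradient could be passed back through whatever produced those coordinates, such as the decoder of a generative model. That is the sense in which a differentiable structure factor calculation can link diffraction amplitudes to neural network parameters.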