Predicting pose distribution of protein domains connected by flexible linkers is an unsolved problem
Abstract
In CASP16, we assessed the ability of computational methods to predict the distribution of relative orientations of two domains tethered by a flexible linker. The range of interdomain distances and orientations (poses) of such domain-linker-domain (D-L-D) proteins can play an important role in protein function, allostery, aggregation, and the thermodynamics of binding. The CASP16 Conformational Ensembles Experiment included two challenges to predict the interdomain pose distribution of a Staphylococcal protein A (SpA) D-L-D construct, called ZLBT-C, in which two of SpA’s five nearly identical domains are connected by either (1) a six-residue wild-type (WT) linker (kadnkf), or (2) an all-glycine (Gly6) linker. The wild-type linker has a highly conserved sequence and is thought to contribute to the energetic barrier for binding with host antibodies. Ground truth was provided by nuclear magnetic resonance (NMR) residual dipolar coupling (RDC) data on the WT protein and small-angle X-ray scattering (SAXS) data on both proteins in solution. Twenty-five predictor groups submitted 35 sets of predicted conformational distributions, in the form of population-weighted finite ensembles of discrete structures. Unlike traditional CASP assessments, which compare predicted atomic models to experimental atomic models, this assessment measured accuracy by back-calculating NMR RDCs and SAXS curves from each ensemble of atomic models and comparing the results with the corresponding experimental data. Accuracy was also assessed by using kernelization to compare ensembles to the continuous orientational distributions optimally fit to experimental data. In our assessment, predictions spanned a wide range of accuracy, but none were close fits to the combined NMR and SAXS data. In addition, none recapitulated the difference between the WT and Gly6 proteins observed in the SAXS data.
These results, and our analysis, highlighted the strengths, weaknesses, and complementarity of NMR RDC and SAXS analyses.
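To make the assessment idea concrete, the following is a minimal sketch of how a population-weighted ensemble can be scored against RDC and SAXS data. It is not the assessors' pipeline. It assumes that per-model RDCs and SAXS intensities have already been back-calculated by some external tool, and that the per-model RDCs are expressed in a common alignment frame so that they average linearly. The function names (`ensemble_average`, `rdc_q_factor`, `saxs_reduced_chi2`) are hypothetical, and the Q-factor and scaled reduced chi-square are commonly used agreement measures chosen here for illustration; the abstract does not state which metrics were actually used.

```python
import numpy as np


def ensemble_average(per_model_values, weights):
    """Population-weighted average of an observable over ensemble members.

    per_model_values: array of shape (n_models, n_observables)
    weights: array of shape (n_models,), non-negative populations
             (normalized to sum to 1 inside this function)
    """
    values = np.asarray(per_model_values, dtype=float)
    w = np.asarray(weights, dtype=float)
    if values.ndim != 2 or w.shape != (values.shape[0],):
        raise ValueError("expected values (n_models, n_obs) and weights (n_models,)")
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    return (w / w.sum()) @ values


def rdc_q_factor(predicted, observed):
    """Q = rms(observed - predicted) / rms(observed); 0 means perfect agreement."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.sum((observed - predicted) ** 2) / np.sum(observed ** 2)))


def saxs_reduced_chi2(predicted, observed, sigma):
    """Reduced chi-square after fitting one overall scale factor to the prediction."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Least-squares optimal scale factor for error-weighted intensities.
    scale = np.sum(observed * predicted / sigma**2) / np.sum(predicted**2 / sigma**2)
    residuals = (observed - scale * predicted) / sigma
    # One degree of freedom is consumed by the fitted scale factor.
    return float(np.sum(residuals**2) / (observed.size - 1))


# Toy usage with made-up numbers (not experimental data).
toy_rdcs = [[1.0, 2.0], [3.0, 4.0]]   # two models, two couplings
toy_weights = [1.0, 3.0]              # populations 0.25 and 0.75
avg_rdcs = ensemble_average(toy_rdcs, toy_weights)
```

Averaging is done on the observables rather than on the coordinates, which is why an ensemble can fit the data even when no single member does.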