Seeing Double: Molecular dynamics simulations reveal the stability of certain alternate protein conformations in crystal structures

Abstract

Proteins jiggle around, adopting ensembles of interchanging conformations. Here we show through a large-scale analysis of the Protein Data Bank and using molecular dynamics simulations, that segments of protein chains can also commonly adopt dual, transiently stable conformations which is not explained by direct interactions. Our analysis highlights how alternate conformations can be maintained as non-interchanging, separated states intrinsic to the protein chain, namely through steric barriers or the adoption of transient secondary structure elements. We further demonstrate that despite the commonality of the phenomenon, current structural ensemble prediction methods fail to capture these bimodal distributions of conformations.

This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/14254145.

The paper investigates the intrinsic stability of alternate protein conformations that exist independently of external influences. By filtering datasets from the Protein Data Bank (PDB), the authors identify a small number of proteins with segments exhibiting multiple conformations. These proteins form a carefully selected subset, free from confounding factors such as ligand interactions, crystal contacts, or environmental effects. The authors analyze them with molecular dynamics (MD) simulations to test whether well-separated alternate conformations remain stable, interconvert, or adopt a distinct (new) conformation. The major success of the paper is identifying cases where each of these outcomes occurs and testing the stability hypotheses with in silico glycine mutations. The authors also test several new computational approaches and expose the limitations of those pipelines in exploring discrete conformational heterogeneity. The major weakness of the paper is its lack of attention to kinetics, which govern whether interconversion is observable in the MD simulations. In general, the paper should more clearly distinguish language about relative stability from language about interconversion rates. Only a limited range of relative stabilities can be modeled in X-ray electron density maps; moreover, no kinetic information is recoverable from those maps, and the rates may span many orders of magnitude. By establishing this test set, the paper provides a nice challenge set for new methods that propose "dynamic" sampling of protein conformation.
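
To illustrate why only a narrow window of relative stabilities is visible in a density map, here is a minimal sketch that converts a pair of refined altloc occupancies into a two-state free-energy difference using the Boltzmann relation ΔG = RT ln(p_A / p_B). The function name, the example occupancies, and the temperature of 298.15 K are our illustrative assumptions, not values from the manuscript; for cryo-cooled crystals the appropriate effective temperature is itself uncertain.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def delta_g_from_occupancies(occ_a, occ_b, temperature_k=298.15):
    """Free energy of state B relative to state A, in kcal/mol.

    Assumes an equilibrium two-state Boltzmann distribution, so that
    occ_b / occ_a = exp(-dG / RT). The temperature is an assumed parameter.
    """
    if occ_a <= 0 or occ_b <= 0:
        raise ValueError("occupancies must be positive")
    return R_KCAL * temperature_k * math.log(occ_a / occ_b)


# Hypothetical occupancy pairs, chosen only for illustration.
for occ_a, occ_b in [(0.5, 0.5), (0.7, 0.3), (0.9, 0.1)]:
    dg = delta_g_from_occupancies(occ_a, occ_b)
    print(f"occupancies {occ_a:.1f}/{occ_b:.1f} -> dG(B - A) = {dg:.2f} kcal/mol")
```

Under these assumptions a 0.9/0.1 occupancy split corresponds to roughly 1.3 kcal/mol. The calculation says nothing about the barrier between the two states, which is the quantity that controls interconversion rates.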

Major Points

Clarify the motivation and biological significance of alternate conformations
The paper would benefit from a clearer motivation for the importance of studying alternate conformations.
1. On page 2 you mention how neglecting these conformations can impact structure prediction models like AlphaFold: 'This could bear substantial unintended consequences, for example when such data are used to train structure prediction models such as AlphaFold (Jumper et al. 2021) as we shall discuss below.'
2. In this same paragraph, consider explaining the biological significance of alternate conformations to make your paper more compelling for biologists who lack strong computational backgrounds. For instance, discussing how these conformations affect protein function, or influence protein interactions, could make it easier for readers to understand the implications of conformational heterogeneity beyond computational concerns.

Justify the focus on a specific type of conformational heterogeneity
Expanding on why the study focuses specifically on conformational heterogeneity independent of ligands, binding partners, or crystal contacts would enhance clarity.
1. In the introduction you write, 'In this work, we are interested in alternate conformations that do not fall into the above categories'. We believe your reason for doing this is to exclude cases where compositional heterogeneity (i.e. the presence or absence of a bound ligand) influences the assessment of conformational heterogeneity. However, to make this point clearly, the introduction would benefit from a more explicit framing of compositional versus conformational diversity. We suggest first explaining this background, and then describing the removal of ligands and binding partners.
2. Similarly, is the exclusion of crystal contacts motivated by the fact that the "validation" is MD run without the crystal environment? If so, stating that link explicitly would help.

Expand on the thermodynamics/kinetics and importance of non-interchanging alternative conformations
1. In the Abstract: 'Proteins jiggle around, adopting ensembles of interchanging conformations... Our analysis highlights how alternate conformations can be maintained as non-interchanging, separated states intrinsic to the protein chain, namely through steric barriers or the adoption of transient secondary structure elements.'
2. Page 2, 'Long-lived metastable structures, kinetically trapped in the protein folding funnel, have previously been detected by atomic force microscopy (Shin et al. 2012), NMR (Gautier et al. 2020), and crystallography (Hua et al. 1995). Our results suggest that these phenomena may be more common than previously realized.'
3. Page 3, 'Molecular dynamics (MD) has proven a powerful tool for simulating the conformations adopted by proteins fluctuating within a thermodynamic energy well surrounding a stable minimum.'
4. You mention both interchanging and non-interchanging alternative conformers. What is the thermodynamic/kinetic difference between the two and why do you care more about non-interchanging for the purposes of this paper?
5. 50-500 ns is not a long time on the scale of protein dynamics. Given the timescales on which similar motions interconvert (as measured, for example, by NMR relaxation), it seems premature to conclude that these proteins do not interconvert. We instead recommend a more thorough discussion of the sampling limitations imposed by the timescales accessible to MD. It would also be helpful to discuss the kinetics of these proteins, drawing on any prior knowledge or literature.
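
To make the sampling concern concrete, the following is a minimal sketch assuming simple first-order (single-exponential) two-state kinetics, which is our simplification and not a model taken from the manuscript. For a state with mean lifetime tau, the probability of seeing at least one transition in a trajectory of length T is 1 - exp(-T / tau). The lifetimes listed are hypothetical values spanning a plausible range, not measurements for any protein in the test set.

```python
import math


def prob_transition_observed(sim_time_ns, mean_lifetime_ns):
    """Probability of at least one transition out of a state during a simulation.

    Assumes first-order kinetics with rate k = 1 / mean_lifetime_ns, so the
    waiting time is exponentially distributed.
    """
    if sim_time_ns < 0 or mean_lifetime_ns <= 0:
        raise ValueError("sim_time_ns must be >= 0 and mean_lifetime_ns > 0")
    return 1.0 - math.exp(-sim_time_ns / mean_lifetime_ns)


# Hypothetical mean lifetimes (ns), chosen only to span several orders of magnitude.
lifetimes_ns = {"100 ns": 1e2, "1 us": 1e3, "10 us": 1e4, "1 ms": 1e6}

for sim_time_ns in (50, 500):  # the trajectory lengths discussed above
    for label, tau_ns in lifetimes_ns.items():
        p = prob_transition_observed(sim_time_ns, tau_ns)
        print(f"T = {sim_time_ns:>3} ns, lifetime = {label:>6}: P(transition) = {p:.4f}")
```

Under this assumption, a 500 ns trajectory would be expected to miss most transitions once the mean lifetime exceeds about a microsecond, so the absence of interconversion in such a run does not by itself distinguish a slowly exchanging pair from a truly non-interchanging one.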

Minor Points

It would be useful to provide a methods section describing how the initial set of structures was curated from the PDB. You mention having "compiled a comprehensive and detailed catalog of alternate backbone segments from the entire PDB," but it is unclear what criteria were used. What search or filtering parameters did you apply (e.g., resolution range, B-factor, or only structures with altlocs)? You state, "Based on the highest resolution structures, we find that over 4% of proteins…" – does this imply that only high-resolution structures were included? Specifying the resolution range would clarify this.
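
As an example of the kind of explicit criterion we have in mind, here is a minimal sketch that flags residues whose backbone atoms carry more than one altloc label. It assumes legacy fixed-column PDB-format ATOM records (altloc indicator in column 17) and is not the authors' pipeline; the function name and the choice of N, CA, C, and O as "backbone" are our assumptions.

```python
BACKBONE_ATOMS = {"N", "CA", "C", "O"}  # assumed definition of backbone


def backbone_altloc_residues(pdb_text):
    """Map (chain, residue number, insertion code) -> set of altloc labels.

    Only residues whose backbone atoms appear with two or more distinct
    altloc labels are returned. Expects legacy PDB fixed-column ATOM records.
    """
    found = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        atom_name = line[12:16].strip()  # columns 13-16
        altloc = line[16]                # column 17
        if altloc == " " or atom_name not in BACKBONE_ATOMS:
            continue
        # chain ID (column 22), residue number (23-26), insertion code (27)
        key = (line[21], int(line[22:26]), line[26])
        found.setdefault(key, set()).add(altloc)
    return {key: labels for key, labels in found.items() if len(labels) > 1}
```

Reporting a rule of this kind in the methods, together with the resolution cutoff and any occupancy or B-factor thresholds, would let readers reproduce the catalog.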
The labeling in Figure 2 could be made clearer. You explain that it shows the abundance of altlocs in PDB structures, but the y-axis labels "Number of chains" and "fraction of chains" are somewhat ambiguous. Consider revising these to "Number/Fraction of chains with altlocs" to specify that the counts refer to chains with alternate conformations. Additionally, the legend entries should be explained in the caption. For instance, clarify whether "Altlocs" includes all types of conformational heterogeneity, including those arising from ligand proximity or crystal contacts, while "No crystal prox." indicates conformational heterogeneity excluding crystal contacts. This would help readers interpret the plot accurately.
The purpose of Supplementary Figure 1 is unclear. The figure demonstrates how altlocs become clearer at higher resolution; however, where the figure is referenced, it serves only to illustrate the concept of a multimodal electron density distribution. Either modifying Supplementary Figure 1 to focus on what a multimodal electron density distribution means, or revising the text where it is referenced to discuss altloc behavior at different resolutions, would make this section clearer.
The purpose of Figure 2 is also unclear. You explain how the 4% of structures differ, but do not explain why the counts for the other categories of proteins are relevant. Explaining why these different types of altlocs matter (in combination with the major point above about why altlocs are important in general) would help this figure fit in with the paper.
The definition of "inseparable" proteins in Figure 3 is unclear. "Failed" proteins encompass proteins that collapse to a third structure, and "unstable" proteins encompass proteins that collapse to the A or B structure. It is unclear what other way there is for the A and B states to become inseparable. Specifying what inseparable means and how it differs from "failed" or "unstable" proteins would be helpful.

Competing interests

The authors declare that they have no competing interests.
