Toward Accurate RNA Folding Thermodynamics: Evaluation of Enhanced Sampling Methods for Force Field Benchmarking

Abstract

Biologically functional RNAs operate near marginal stability, and their rugged free-energy landscapes and profound structural dynamics – typically not captured by structural biology experiments – play decisive roles. Atomistic molecular dynamics (MD) simulations provide a unique means to characterize these features. However, the applicability of atomistic MD is currently limited by accessible simulation timescales and, most importantly, by force-field (FF) accuracy. Folding free energies (ΔG°_fold) of small RNA motifs represent well-defined targets for quantitative benchmarking of RNA FFs. In practice, however, obtaining thermodynamic estimates that are sufficiently robust for direct comparison with experimental data remains highly challenging, even for small RNA systems, and many published studies rely on sampling that is not fully converged. Here, we systematically assess the performance of widely used advanced enhanced-sampling techniques using the 8-mer r(gcGAGAgc) tetraloop as a representative benchmark system. We test temperature replica exchange (T-REMD), two solute-tempering variants of replica exchange (REST2 and REHT), as well as well-tempered metadynamics and on-the-fly probability enhanced sampling combined with solute tempering (ST-MetaD and ST-OPES). Among the tested approaches, T-REMD proves to be the most robust, yielding reproducible folding equilibria and consistent estimates of ΔG°_fold after approximately 20 μs of simulation time, independent of the initial folded or unfolded conformational ensemble. Our results provide practical guidelines for selecting sampling protocols suitable for quantitative RNA benchmarks and lay the foundation for systematic validation and future refinement of RNA FFs.

This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/18475062.

Preprint

Toward Accurate RNA Folding Thermodynamics: Evaluation of Enhanced Sampling Methods for Force Field Benchmarking
Petra Kührová, Vojtěch Mlýnský, Michal Otyepka, Jiří Šponer, Pavel Banáš
https://doi.org/10.64898/2026.01.19.700441

Authors of the review

Giovanni Bussi and Salvatore Di Marco.

This report was written after a journal club presented by Giovanni Bussi at a bussilab group meeting. All members of the group, including external guests, are acknowledged for participating in the discussion and providing feedback that helped in preparing this report.

The corresponding authors of the original manuscript were consulted before posting this report.

Summary

This manuscript reports an extensive computational study of an RNA hairpin loop with sequence gcGAGAgc, simulated using a recent variant of the AMBER force field with hydrogen-bond corrections. The same system is simulated with a number of enhanced sampling techniques, providing a fair comparison on a system relevant to the study of RNA structural changes, including loop formation and base pairing. The manuscript concludes that T-REMD is the most reliable approach.

Comments

The explanation of the replica ladders used in the different approaches is not completely clear. For instance, for the REHT method, it is not clear whether a replica exists with an unscaled Hamiltonian and T = 300 K. We believe it would be very useful to add tables to the SI listing the simulation temperatures and scaling factors for each simulation, together with a note on which replica was analyzed to report the populations of the folded state.
We are not sure about the interpretation of Figure 4D. Based on the text, we expect the populations of the 30 microsecond MetaD simulations to be 24% and 32%. However, these numbers do not seem compatible with the rightmost points of the red and yellow lines in Figure 4D.
The comparison between methods might be slightly unfair given the different computational effort that each requires. One can infer from the figures that the initial part of the T-REMD simulation displays more transitions than the entire REST2 trajectories. Nevertheless, a comment on how simulations of different lengths should be compared would help the reader. The same applies to the comparison between T-REMD and OPES+REST2, for which relatively short trajectories are shown ("Nevertheless, in the case of the 8-mer GAGA TL, ST-OPES still performs worse than T-REMD simulations in terms of statistical uncertainty.").
MetaD was coupled with REST2, which is a fairly standard protocol for the authors' group. However, MetaD has previously been proposed in combination with T-REMD (Bottaro et al, JPCL 2016). Without running a new simulation, could the authors comment on how they would expect MetaD+T-REMD to perform on this system?
In a previous paper (Mlynsky et al, JCTC 2022), by analysing simulations on the same system, the authors concluded that: (a) REST2 does not converge in 120 microseconds per replica (compatible with this paper) and (b) MetaD+REST2 improves convergence by inducing folding transitions. Is this result consistent with the current manuscript? Visualising transitions in the continuous trajectories of the MetaD+REST2 simulations in the current manuscript could help. An explicit sentence stating whether the previous results are compatible with the current paper, or are outdated, would also make the new manuscript more consistent with the existing literature. In general, we believe that plots showing transitions in the continuous replicas would help for both MetaD+REST2 and OPES+REST2.
Writing remarks:
- We suggest mentioning in the captions of Figures 2 and 3 that the trajectories are "continuous" (or "demultiplexed"), because this might not be clear to someone who reads the captions without reading the text first.
- Figs. 1C and 3E would be easier to compare if they used the same y-limits.
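
To illustrate why the folded-state populations discussed in the comment on Figure 4D matter for a force-field benchmark, the following is a minimal sketch that converts a folded population into a folding free energy. It assumes a simple two-state (folded/unfolded) model and a temperature of 300 K; both are our assumptions for illustration, and the function name `folding_free_energy` is hypothetical, not taken from the manuscript.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL = 0.0019872041


def folding_free_energy(p_folded: float, temperature: float = 300.0) -> float:
    """Two-state estimate of the folding free energy in kcal/mol.

    Uses dG_fold = -R * T * ln(p_folded / (1 - p_folded)).
    The two-state model and the default temperature are assumptions.
    """
    if not 0.0 < p_folded < 1.0:
        raise ValueError("p_folded must lie strictly between 0 and 1")
    return -R_KCAL * temperature * math.log(p_folded / (1.0 - p_folded))


if __name__ == "__main__":
    # Populations quoted in the comment on Figure 4D.
    for p in (0.24, 0.32):
        dg = folding_free_energy(p)
        print(f"p_folded = {p:.2f} -> dG_fold = {dg:+.2f} kcal/mol")
```

Under these assumptions, populations of 24% and 32% correspond to roughly +0.69 and +0.45 kcal/mol, a difference of about 0.24 kcal/mol, which shows how a population discrepancy of this size translates into the free-energy scale used for benchmarking.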

Competing interests

The authors declare that they have no competing interests.

Use of Artificial Intelligence (AI)

The authors declare that they did not use generative AI to come up with new ideas for their review.
