Expanding frontiers of complex reaction network exploration through a general reactive machine learning potential

Abstract

Developing general machine learning potentials (MLPs) for complex reactive systems remains a fundamental challenge because training datasets insufficiently sample critical transition states (TSs) and non-equilibrium structures. Here, we introduce an MD/CD-AL framework that integrates the molecular dynamics/coordinate driving (MD/CD) method with active learning to generate the MDCD20 dataset, which encompasses diverse reactive configurations with up to 20 heavy atoms. With this dataset, we develop a general reactive MLP with quantum mechanics (QM) accuracy, MDCD-NN, for H-/C-/N-/O-containing gas-phase reactions. Leveraging MD/CD searching, MDCD-NN enables automated and efficient construction of complex reaction networks with 10⁴-fold acceleration compared to the reference QM method, as demonstrated in three real-world reaction systems featuring intricate regio-/stereo-selectivities. In each case, our approach successfully identifies thousands of intermediates and TSs, constructing multistep networks that rationalize experimental observations and extend established mechanisms. Furthermore, MDCD-NN enables nanosecond-scale dynamics simulations for calculating reactive free energy surfaces, bridging quantum-level accuracy with scalable simulations. Our framework provides a paradigm for developing general reactive MLPs, enabling high-throughput mechanistic insights for complex chemical systems.
