OpenCSP: A Deep Learning Framework for Crystal Structure Prediction from Ambient to High Pressure

Read the full article

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

High-pressure crystal structure prediction (CSP) underpins advances in condensed matter physics, planetary science, and materials discovery. Yet, most large atomistic models are trained on near-ambient, equilibrium data, leading to degraded stress accuracy at tens to hundreds of gigapascalsand sparse coverage of pressure-stabilized stoichiometries and dense coordination motifs. Here,we introduce OpenCSP, a machine learning framework for CSP tasks spanning ambient to high-pressure conditions. This framework comprises an open-source pressure-resolved dataset alongsidea suite of publicly available atomistic models that are jointly optimized for accuracy in energy, force, and stress predictions. The dataset is constructed via randomized high-pressure sampling anditeratively refined through an uncertainty-guided concurrent learning strategy, which enriches under-represented compression regimes while suppressing redundant DFT labeling. Despite employing atraining corpus one to two orders of magnitude smaller than those of leading large models, OpenCSP achieves comparable or superior performance in high-pressure enthalpy ranking and stability prediction. Across benchmark CSP tasks spanning a wide pressure window, our models match or surpass MACE-MPA-0, MatterSim v1 5M, and GRACE-2L-OAM, with the largest gains observed at elevated pressures. These results demonstrate that targeted, pressure-aware data acquisition coupledwith scalable architectures enables data-efficient, high-fidelity CSP, paving the way for autonomous materials discovery under ambient and extreme conditions.

Article activity feed