Uncertainty-Aware ML Surrogates for DFT Strain–Bandgap Engineering in CsSnX3 (X = Cl, Br, I)
Abstract
Strain is a powerful and reversible knob for tuning the electronic structure of halide perovskites, but brute-force first-principles mapping of wide compressive/tensile windows is costly and can be numerically fragile near metallization. We develop uncertainty-aware machine-learning surrogates for the strain–band-gap relation 𝐸𝑔(𝜀) in cubic CsSnX3 (X = Cl, Br, I) under isotropic and uniaxial-𝑐 loading. A curated, convergence-checked DFT dataset (PBE, ultrasoft pseudopotentials, 600 eV cutoff, 12×12×12 𝑘-mesh; SOC omitted by design) is used to train per-curve models selected via leave-one-out (LOO) cross-validation across linear, kernel, tree, instance-based, and Gaussian-process families. Kernel surrogates dominate: Gaussian-process regression (GPR) and kernel ridge regression achieve the best accuracy, including a LOO RMSE of ∼0.025 eV for CsSnBr3 under uniaxial-𝑐 loading. A single “hard” case, CsSnCl3 under isotropic compression, shows local non-convexity near an incipient metallic window; there, a 𝑘NN model outperforms the smooth kernels and the predictive uncertainty widens appropriately. Calibrated intervals (analytic 𝜎 for GPR; bootstrap 𝜎 otherwise) support principled, goal-optional acquisition of the next DFT points, concentrating effort where information gain is highest. Deformation-potential analysis at 𝜀 = 0% yields two robust trends: isotropic loading produces a larger |𝑎𝑔| than uniaxial-𝑐 loading for all halides, and the isotropic sensitivity follows Cl > Br > I. The resulting surrogates provide fast, interpretable, and confidence-quantified maps for strain–band-gap engineering and a reproducible playbook for directing high-value calculations. All data, models, and scripts are released to enable reuse.
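The workflow summarized in the abstract (per-curve model selection by LOO RMSE, a bootstrap 𝜎 for models without an analytic one, and uncertainty-driven choice of the next DFT point) can be illustrated with a minimal, standard-library-only sketch. It is not the released code: the strain and gap values are synthetic placeholders with an artificial kink, only two simple stand-in models (a least-squares line and 𝑘NN) are compared rather than the GPR and kernel-ridge families used in the paper, and all function names are hypothetical.

```python
import math
import random

def knn_predict(xs, ys, x, k=2):
    """Average the gaps of the k training strains nearest to x."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]
    return sum(ys[i] for i in nearest) / len(nearest)

def linear_predict(xs, ys, x):
    """Ordinary least-squares line through (xs, ys), evaluated at x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((xi - mx) ** 2 for xi in xs)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys)) / sxx
    return my + slope * (x - mx)

def loo_rmse(predict, xs, ys):
    """Leave-one-out RMSE: refit without point i, then predict point i."""
    sq_errors = []
    for i in range(len(xs)):
        x_train = xs[:i] + xs[i + 1:]
        y_train = ys[:i] + ys[i + 1:]
        sq_errors.append((predict(x_train, y_train, xs[i]) - ys[i]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

def bootstrap_sigma(predict, xs, ys, x, n_boot=200, seed=0):
    """Standard deviation of predictions at x over resampled training sets."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:  # a line needs at least two distinct strains
            continue
        preds.append(predict(bx, [ys[i] for i in idx], x))
    mean = sum(preds) / len(preds)
    return math.sqrt(sum((p - mean) ** 2 for p in preds) / len(preds))

# Synthetic placeholder curve (NOT DFT data): strain in %, gap in eV,
# with the gap clipped at zero to mimic a closing (metallic) window.
strains = [float(e) for e in range(-6, 7)]
gaps = [max(0.0, 0.6 + 0.15 * e) for e in strains]

# Per-curve model selection by LOO RMSE.
models = {"linear": linear_predict, "knn": knn_predict}
scores = {name: loo_rmse(fn, strains, gaps) for name, fn in models.items()}
best_name = min(scores, key=scores.get)

# Acquisition: propose the candidate strain with the largest bootstrap sigma.
candidates = [s + 0.5 for s in strains[:-1]]
sigmas = [bootstrap_sigma(models[best_name], strains, gaps, c) for c in candidates]
next_strain = candidates[sigmas.index(max(sigmas))]

print(scores, best_name, next_strain)
```

Choosing the candidate with the largest 𝜎 is only one simple acquisition rule, used here as an assumption; with a GPR surrogate the same loop would use the analytic predictive standard deviation in place of `bootstrap_sigma`.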