Interpretable Machine Learning Framework for Geochemical Classification: Advancing Mineral and Geothermal Resource Assessment
Abstract
Accurate lithological classification from geochemical data is fundamental to quantitative resource exploration, evaluation, and risk reduction. This study develops an explainable ensemble learning framework that integrates Random Forest, XGBoost, CatBoost, and Multi-Layer Perceptron models to classify 3,868 igneous rock samples from their major oxide compositions. The CatBoost model performed best, with 89.9% accuracy and a macro-averaged F1 score of 85.7%, outperforming the other optimized models. Explainability analysis using SHAP (SHapley Additive exPlanations) quantitatively validated model outputs against petrological theory: SiO2 emerged as the dominant discriminator (importance: 1.026), followed by CaO and MgO, reflecting magmatic differentiation processes. The framework also reports prediction confidence to quantify geological uncertainty in resource assessment contexts. Rapid, interpretable geochemical classification of this kind supports subsurface mapping and reduces exploration uncertainty in mineral and geothermal resource evaluation. Sub-second inference times make the framework feasible for field deployment in exploration programs. By linking machine learning outputs to geological understanding, this work advances quantitative resource geoscience through transparent, high-accuracy classification suitable for mineral prospectivity mapping, geothermal reservoir characterization, and exploration risk assessment.
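The abstract reports both accuracy and a macro-averaged F1 score (F1-macro), and mentions a prediction confidence. The following minimal Python sketch illustrates how these quantities can differ and be computed. It is not the authors' implementation: the rock-type labels, the toy predictions, and the function names are hypothetical, and the normalized-entropy confidence measure is an assumption, since the abstract does not say how confidence is derived.

```python
from math import log


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes weigh as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never correct scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def confidence(probs):
    """Return (top-class probability, normalized entropy in [0, 1]) for one sample.

    Assumes `probs` holds at least two class probabilities that sum to 1.
    Normalized entropy near 0 means a confident prediction; near 1 means
    the model cannot separate the classes for this sample.
    """
    top = max(probs)
    entropy = -sum(p * log(p) for p in probs if p > 0)
    return top, entropy / log(len(probs))


# Hypothetical, imbalanced toy data: 6 basalt, 3 andesite, 1 rhyolite.
y_true = ["basalt"] * 6 + ["andesite"] * 3 + ["rhyolite"]
# One andesite is mistaken for basalt, and the lone rhyolite for andesite.
y_pred = ["basalt"] * 6 + ["andesite"] * 2 + ["basalt"] + ["andesite"]

acc = accuracy(y_true, y_pred)   # 0.8
f1m = macro_f1(y_true, y_pred)   # about 0.530

# Hypothetical class probabilities for two samples.
sure_top, sure_entropy = confidence([0.95, 0.03, 0.02])
unsure_top, unsure_entropy = confidence([0.40, 0.35, 0.25])
```

In this toy case the missed minority class pulls macro F1 well below accuracy, which is the same direction as the gap between the 89.9% and 85.7% figures above. The abstract does not state what causes the gap in the study itself.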