Machine Learning Meets Ecology: XGBoost‑Based Prediction of Endangered Species Refugia Using Multi‑Source Environmental Data
Abstract
Buxus hyrcana, an endangered and ecologically significant tree species of the Hyrcanian forests, faces severe threats from climate change, land-use pressures, and habitat degradation. Accurate prediction of its potential distribution is therefore critical for conservation and restoration planning. In this study, we applied the eXtreme Gradient Boosting (XGBoost) algorithm to model the distribution of B. hyrcana under three predictor sets: (i) WorldClim bioclimatic and topographic variables, (ii) CHELSA bioclimatic and topographic variables, and (iii) WorldClim bioclimatic, topographic, and land-use variables. Model performance was evaluated using the area under the ROC curve (AUC), the true skill statistic (TSS), Cohen's kappa, and accuracy. All four metrics indicated strong predictive capacity, with the highest performance achieved when land-use data were incorporated. Variable importance analysis revealed a stable set of key predictors: thermal stability (isothermality, Bio3), annual mean temperature (Bio1), mean temperature of the wettest quarter (Bio8), annual precipitation (Bio12), and slope-related indices (LS Factor). These predictors highlight the species’ sensitivity to moderate climatic regimes and physiographic constraints. Response curve analysis confirmed that B. hyrcana thrives under moderate temperature and precipitation conditions, while extreme climatic values sharply reduce occurrence probability. Habitat suitability maps consistently identified Mazandaran Province as the most suitable region, with additional restoration potential in Golestan Province. Our findings demonstrate that integrating high-resolution climatic, topographic, and land-use data within advanced machine learning frameworks significantly enhances the accuracy and ecological realism of species distribution models. This study provides a robust methodological framework for predicting the distribution of climate-sensitive, endangered species and offers actionable insights for conservation prioritization and restoration planning in the Hyrcanian forests.
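The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not state which software, presence/absence threshold, or validation split was used, so everything below is an assumption: the function name `evaluation_metrics`, the 0.5 threshold, and the toy labels and scores are hypothetical and are not data from the study. TSS is computed as sensitivity + specificity - 1, kappa as Cohen's kappa from the confusion matrix, and AUC as the probability that a randomly chosen presence receives a higher score than a randomly chosen absence.

```python
def evaluation_metrics(y_true, scores, threshold=0.5):
    """Accuracy, TSS, Cohen's kappa, and AUC for a binary presence/absence model.

    y_true: 1 for presence, 0 for absence (both classes must occur).
    scores: predicted suitability in [0, 1].
    threshold: assumed cut-off for converting scores to presence/absence.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    tss = sensitivity + specificity - 1

    # Cohen's kappa: observed agreement corrected for chance agreement.
    # Undefined when chance agreement equals 1 (e.g. only one class predicted
    # and observed); this sketch does not handle that case.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)

    # Threshold-independent AUC via pairwise comparison; ties count as half.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    auc = wins / (len(pos) * len(neg))

    return {"accuracy": accuracy, "tss": tss, "kappa": kappa, "auc": auc}


# Hypothetical toy data, not taken from the preprint.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
metrics = evaluation_metrics(toy_labels, toy_scores)
print(metrics)
```

Accuracy, TSS, and kappa depend on the chosen threshold, whereas AUC does not, which is one reason species distribution studies commonly report them together.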