Machine Learning and Explainable AI for Agricultural Drought Prediction: A Comparative Analysis of Gradient Boosting Methods Using Multi-Source Earth Observation Data

Abstract

Drought monitoring and prediction remain critical challenges in climate science and agricultural management, particularly under accelerating climate change. This study presents a machine learning framework for drought susceptibility mapping in Iowa, USA, using multi-source Earth observation data and explainable artificial intelligence. We systematically evaluated eleven supervised learning algorithms, including gradient boosting methods (LightGBM, XGBoost, CatBoost), ensemble approaches (Random Forest, Extra Trees), and neural networks, for classifying drought severity based on United States Drought Monitor (USDM) categories. The models were trained on 8,200 stratified samples spanning 2015–2021, derived from satellite-based vegetation indices (NDVI, EVI, LAI, FPAR, VCI, VHI), land surface temperature metrics (LST, TCI), precipitation data (CHIRPS), soil moisture (SMAP), and land cover information. Performance evaluation using confusion matrices, F1-scores, and ROC-AUC analysis revealed that gradient boosting algorithms significantly outperformed traditional machine learning approaches, with LightGBM achieving the highest accuracy (95%) and macro-averaged F1-score (0.94). SHAP (SHapley Additive exPlanations) interpretability analysis identified precipitation deficits, soil moisture anomalies, and vegetation stress as primary drought drivers, with synergistic interactions between elevated temperature and reduced rainfall amplifying severe drought conditions. Spatial predictions showed climatologically consistent patterns, with elevated drought susceptibility in southwestern Iowa and lower risk in northern riverine corridors. The framework's ability to replicate expert-driven drought classifications while providing mechanistic insights establishes machine learning as a viable complement to traditional drought monitoring systems. These findings contribute to the growing body of climate informatics research and provide a transferable methodology for drought early warning systems in agricultural regions globally.
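
The abstract reports both accuracy and a macro-averaged F1-score, which can differ when drought classes are imbalanced. As a minimal sketch of how the two metrics relate to a confusion matrix, the Python below computes both from a small matrix. The function names and the three-class `toy` matrix are hypothetical and purely illustrative; they are not the study's code or data.

```python
def accuracy(confusion):
    """Overall accuracy from a square confusion matrix (rows = true, columns = predicted)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


def macro_f1(confusion):
    """Macro-averaged F1: the unweighted mean of the per-class F1 scores."""
    n = len(confusion)
    f1_scores = []
    for k in range(n):
        tp = confusion[k][k]                               # class k predicted as k
        fp = sum(confusion[i][k] for i in range(n)) - tp   # other classes predicted as k
        fn = sum(confusion[k]) - tp                        # class k predicted as something else
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / n


# Hypothetical 3-class matrix (illustrative only, not data from the study).
toy = [
    [50, 2, 0],
    [3, 30, 2],
    [0, 1, 12],
]

print(f"accuracy = {accuracy(toy):.3f}")
print(f"macro F1 = {macro_f1(toy):.3f}")
```

Because the macro average weights every class equally, a rare severe-drought class affects it as much as a common class does, which is why it is often reported alongside accuracy.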
