Soil Moisture Retrieval Based on Ensemble Learning Models Using Landsat8 Data in Areas of High Heterogeneity

Abstract

Soil moisture (SM) is intricately connected to various components of the Earth system and plays a critical role in human survival and development. Although significant efforts have been made in soil moisture retrieval and validation, the monitoring and assessment of high-precision, high-resolution SM networks remain limited. This study focuses on a long-term SM network (QLB-NET) located on a high-altitude plateau with significant topographic variability. Using more than 20 soil moisture-related indices derived from Landsat 8 data, along with elevation and its derivatives, we estimate soil moisture at a 30 m resolution with ensemble learning models. Soil moisture is treated as the dependent variable and the other data as independent variables; together they form a dataset that is divided into training, validation, and test sets. Four ensemble learning models were evaluated for their performance in soil moisture retrieval: Random Forest (RF), Extremely Randomized Trees (ERT), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). The CatBoost model performed best, surpassing the other models on the training, validation, and test sets. On the test set, it achieved a coefficient of determination (R²) of 0.83, a root mean square error (RMSE) of 0.052 m³/m³, a bias of 0.003 m³/m³, and a mean square error (MSE) of 0.003. We also used the four models to generate 30 m soil moisture maps for three different dates, providing more detailed insight into the spatial distribution of soil moisture. SHapley Additive exPlanations (SHAP) was used to assess the contribution of different features to the soil moisture predictions, revealing that elevation had the greatest impact. Finally, we assessed the overall heterogeneity of QLB-NET, using terrain complexity to represent local heterogeneity. Our findings indicate that the north-central region exhibits significant local heterogeneity. Moreover, areas with higher heterogeneity also show greater uncertainty, making them a key source of model prediction error. The findings of this study contribute to more accurate retrieval of soil moisture, enhancing both new and existing methods.
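
The four test-set metrics quoted in the abstract (R², RMSE, bias, MSE) can be illustrated with a minimal sketch. This is not the authors' code. It assumes R² is computed as 1 minus the ratio of residual to total sum of squares, and that bias is the mean of predicted minus observed values; the abstract states neither convention. The function name `evaluation_metrics` and the sample values are hypothetical.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return R^2, RMSE, bias, and MSE for paired soil moisture values (m^3/m^3).

    Assumed conventions (not confirmed by the abstract):
    R^2 = 1 - SS_res / SS_tot, and bias = mean(predicted - observed).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    rmse = math.sqrt(mse)
    bias = sum(errors) / n  # positive values mean overestimation
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"r2": r2, "rmse": rmse, "bias": bias, "mse": mse}


# Hypothetical values for illustration only; these are not data from the study.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.39]
metrics = evaluation_metrics(observed, predicted)
print(metrics)
```

Because MSE is the square of RMSE, the reported values are mutually consistent: 0.052² ≈ 0.0027, which rounds to 0.003.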
