The Spatial Distribution and Driving Mechanism of Soil Organic Matter in Hilly Basin Areas Based on Genetic Algorithm Variable Combination Optimization and SHAP Interpretation
Abstract
Studying the spatial variation patterns and influencing factors of soil organic matter (SOM) in hilly and basin areas is of great significance for guiding agricultural production practices. This study takes Lanxi City as an example and comprehensively considers soil formation factors such as climate, vegetation, and terrain. A genetic algorithm is used to optimize the combination of 47 environmental variables, and two models are constructed: a random forest (RF) model and an improved version, a random forest model based on genetic algorithm variable combination optimization (RF-GA). The SHAP interpretation method is then used to quantitatively analyze the spatial distribution characteristics of the SOM content and to identify its main driving factors. Compared with the ordinary Kriging (OK) and RF methods, the RF-GA model demonstrates a significantly improved prediction accuracy (R² = 0.49; RMSE = 3.49 g·kg⁻¹; MAE = 3.019; LCCC = 0.67). Its R² is 87.84% and 56.29% higher than those of the other two models. The model predictions indicate that the SOM content in the study area ranges from 12.11 to 31.38 g·kg⁻¹, with a higher content in mountainous areas and a lower content in plains. A further SHAP analysis shows that terrain, climate, and biological factors are the key environmental factors affecting the spatial differentiation of SOM, with the CNBL and DEM playing particularly significant roles. By regulating moisture, erosion and deposition, vegetation distribution, and microclimate conditions, these factors significantly affect the spatial distribution of SOM.
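The four accuracy measures reported above (R², RMSE, MAE, and LCCC, taken here to be Lin's concordance correlation coefficient) can be computed from paired observed and predicted SOM values. The following is a minimal sketch using only the Python standard library. The function name `evaluate` and the sample values are illustrative assumptions, not data or code from the study. R² is computed as 1 − SS_res/SS_tot, which may differ from the definition the authors used.

```python
import math


def evaluate(observed, predicted):
    """Return R2, RMSE, MAE and Lin's CCC for paired observations and predictions."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    pairs = list(zip(observed, predicted))
    ss_res = sum((o - p) ** 2 for o, p in pairs)        # residual sum of squares
    ss_tot = sum((o - mean_o) ** 2 for o in observed)   # total sum of squares
    var_o = ss_tot / n                                  # population variances
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in pairs) / n
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(o - p) for o, p in pairs) / n,
        # Lin's CCC penalizes both scatter and systematic bias (mean shift).
        "lccc": 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2),
    }


if __name__ == "__main__":
    # Made-up SOM values in g/kg, for illustration only.
    obs = [18.2, 22.5, 15.9, 27.4, 20.1]
    pred = [19.0, 21.3, 17.2, 25.0, 20.8]
    for name, value in evaluate(obs, pred).items():
        print(f"{name}: {value:.3f}")
```

Because RMSE and MAE carry the units of the target variable, both are in g·kg⁻¹ when the inputs are SOM contents in g·kg⁻¹.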
In summary, the interpretable RF-GA prediction model constructed in this study not only effectively reveals the spatial distribution and driving mechanisms of SOM in hilly and basin areas but also provides a solid theoretical basis and practical guidance for accurate SOM mapping, the formulation of sustainable utilization strategies for soil resources, and ensuring national food security.
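The variable combination optimization described in the abstract can be illustrated with a generic genetic algorithm over 0/1 masks, where each bit marks whether a candidate variable is included. This is a minimal sketch under stated assumptions, not the authors' implementation. The operators (tournament selection, uniform crossover, bit-flip mutation, elitism), all parameter values, and the names `ga_select` and `toy_fitness` are hypothetical. In a real workflow the fitness would be something like the cross-validated accuracy of an RF model trained on the selected variables; here a toy fitness stands in so the example runs with the standard library alone.

```python
import random


def ga_select(n_vars, fitness, pop_size=30, generations=40,
              crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Search for a 0/1 mask over n_vars variables that maximizes fitness(mask)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    best_fit = fitness(best)
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in pop]

        def tournament():
            # Pick two distinct individuals at random and keep the fitter one.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        next_pop = [best[:]]  # elitism: carry the best mask found so far
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                # Uniform crossover: each gene comes from either parent.
                child = [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]
            else:
                child = p1[:]
            # Bit-flip mutation toggles inclusion of individual variables.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
        cand = max(pop, key=fitness)
        cand_fit = fitness(cand)
        if cand_fit > best_fit:
            best, best_fit = cand[:], cand_fit
    return best, best_fit


# Hypothetical stand-in for a model-based score: rewards three "informative"
# variables (indices chosen arbitrarily) and penalizes the subset size.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


if __name__ == "__main__":
    mask, score = ga_select(47, toy_fitness)
    selected = [i for i, bit in enumerate(mask) if bit]
    print("selected variable indices:", selected)
    print("fitness:", round(score, 3))
```

The size penalty in the toy fitness reflects a common design choice in variable selection: prefer smaller subsets when they score about as well as larger ones.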