Machine Learning vs Langmuir: A Multioutput XGBoost Regressor Better Captures Soil Phosphorus Adsorption Dynamics
Abstract
Accurate prediction of soil phosphorus (P) adsorption capacity is essential for efficient fertilizer management and environmental protection. Traditional isotherm models, such as the Langmuir equation, have been widely used to quantify P sorption, but they often fail to account for the nonlinear and multivariate nature of soil systems. This study evaluates the performance of a multi-output XGBoost regression model trained on laboratory-measured P adsorption data from 147 soils, representing a wide range of textures, pH levels, and CaCO₃ contents. The model was developed to simultaneously predict P adsorption at five equilibrium concentrations (1, 2, 4, 6, and 10 mg/L). SHAP analysis and causal discovery via DirectLiNGAM indicated that initial Olsen P concentration and sand content are the primary factors reducing P adsorption. The multi-output XGBoost model was then compared against classical Langmuir isotherms using an extended dataset of 10,389 soil samples, which was binned into four groups by Olsen P concentration and four groups by sand content. These two variables were chosen for binning because the XGBoost model identified them as highly influential and the causal inference analysis supported a causal relationship between each of them and soil P sorption capacity. The XGBoost model captured the effect of Olsen P and sand content better than the Langmuir model: it predicted a 12.6% drop in P adsorption in the very high Olsen P group and a 19.2% drop in the very high sand content group, both substantially larger than the reductions estimated by the Langmuir isotherms. These results indicate that machine learning models trained on well-designed experimental data can offer a superior alternative to classical isotherms for modeling P sorption dynamics.
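For readers unfamiliar with the Langmuir baseline, the following is a minimal sketch of the isotherm and one common way to fit it, evaluated at the five equilibrium concentrations named in the abstract. The abstract does not say how the Langmuir parameters were estimated, so the linearized least-squares fit here is an assumption, as are the function names, the units of adsorbed P (mg/kg), and the synthetic parameter values. None of the numbers come from the study.

```python
# Illustrative sketch only: synthetic values, not data from the study.

def langmuir(c, q_max, k):
    """Langmuir isotherm: adsorbed P at equilibrium concentration c (mg/L).

    q_max is the adsorption maximum and k the affinity constant (L/mg).
    """
    return q_max * k * c / (1.0 + k * c)


def fit_langmuir_linearized(conc, adsorbed):
    """Estimate (q_max, k) by ordinary least squares on the linearized form
    c/q = c/q_max + 1/(k * q_max), i.e. slope = 1/q_max, intercept = 1/(k*q_max).
    This is one common fitting choice, assumed here for illustration.
    """
    y = [c / q for c, q in zip(conc, adsorbed)]
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(conc, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 1.0 / slope, slope / intercept


# The five equilibrium concentrations from the abstract (mg/L).
CONCENTRATIONS = [1.0, 2.0, 4.0, 6.0, 10.0]

# Hypothetical soil with q_max = 400 mg/kg and k = 0.5 L/mg (assumed values).
synthetic = [langmuir(c, 400.0, 0.5) for c in CONCENTRATIONS]
q_max_hat, k_hat = fit_langmuir_linearized(CONCENTRATIONS, synthetic)
print(q_max_hat, k_hat)
```

Because the Langmuir curve has only two parameters per soil, soil properties such as Olsen P or sand content can influence its predictions only through those two fitted values, whereas a multi-output regressor takes the soil properties as direct inputs and predicts all five adsorption values at once.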