Evaluation of spatial effects on confounding hydrogeochemical interactions in nitrate pollution using the SPAMAXAC method based on Machine Learning and Explainable Artificial Intelligence
Abstract
Nitrate contamination of groundwater threatens human health and ecosystem sustainability, demanding predictive frameworks for its monitoring and mitigation. In this study, we integrated machine learning (ML) and explainable artificial intelligence (XAI) to predict nitrate levels and to highlight spatial effects on hydrogeochemical interactions. We assessed ML models with and without spatial components for 2019 (19 models) and 2023 (19 models). The Light Gradient Boosting Machine and Linear Regression models provided relatively better accuracy, with R² values ranging from 0.44 to 0.51, reflecting the intricacies that these models captured among the hydrogeochemical features. We found that SHAP (SHapley Additive exPlanations) provided inflated R² values (up to 28%) compared to the ML-derived values, raising concerns about the utility of XAI in groundwater studies. Chloride and bicarbonate remained the dominant predictors of nitrate when the spatial effect was considered. Without spatial effects, phosphate and pH were the dominant predictors of nitrate in 2019, and chloride and sodium were the dominant predictors in 2023. We propose the SPAMAXAC method, which interweaves the spatial effect, ML, SHAP, and correlation to address this problem to a reasonable extent. We conclude that ML integrated with XAI holds immense potential for groundwater studies, but caution is needed to avoid overstated interpretations.
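The comparison described in the abstract (the same model fitted with and without spatial components, then explained with SHAP) can be illustrated with a minimal sketch. This is not the authors' SPAMAXAC implementation. The data are synthetic, the coordinate features `x_coord` and `y_coord` and all coefficients are hypothetical, and SHAP values are computed with the closed form for a linear model under an assumed feature independence, namely coefficient times the deviation from the background mean.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (intercept, coefs)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0], beta[1:]

def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def linear_shap(X, coefs, background):
    """SHAP values of a linear model, assuming independent features:
    phi_j = coef_j * (x_j - mean of x_j over the background data)."""
    return (X - background.mean(axis=0)) * coefs

# Synthetic stand-in data (assumption: NOT the study's measurements).
rng = np.random.default_rng(0)
n = 400
chloride = rng.normal(size=n)
bicarbonate = rng.normal(size=n)
x_coord = rng.uniform(0, 1, n)  # hypothetical spatial coordinate
y_coord = rng.uniform(0, 1, n)  # hypothetical spatial coordinate
nitrate = (1.5 * chloride + 0.5 * bicarbonate + 2.0 * x_coord
           + rng.normal(scale=1.0, size=n))

names_chem = ["chloride", "bicarbonate"]
names_spatial = names_chem + ["x_coord", "y_coord"]
X_chem = np.column_stack([chloride, bicarbonate])
X_spatial = np.column_stack([chloride, bicarbonate, x_coord, y_coord])

# Simple hold-out split: first 300 rows for training, the rest for testing.
train = np.arange(n) < 300
test = ~train

results = {}
for label, X, names in [("non-spatial", X_chem, names_chem),
                        ("spatial", X_spatial, names_spatial)]:
    intercept, coefs = fit_linear(X[train], nitrate[train])
    pred = intercept + X[test] @ coefs
    shap_vals = linear_shap(X[test], coefs, X[train])
    # Global importance: mean absolute SHAP value per feature.
    importance = np.abs(shap_vals).mean(axis=0)
    ranking = [names[i] for i in np.argsort(importance)[::-1]]
    results[label] = {
        "r2": r2_score(nitrate[test], pred),
        "ranking": ranking,
        "shap": shap_vals,
        "pred": pred,
        # Expected model output over the background (training) data.
        "base": intercept + X[train].mean(axis=0) @ coefs,
    }
    print(f"{label}: R2={results[label]['r2']:.2f}, ranking={ranking}")
```

Printing the held-out R² next to the SHAP ranking for both feature sets makes it easier to see whether an attribution is backed by the model's actual predictive skill, which is the kind of caution the abstract calls for. A tree-based model such as LightGBM would need a dedicated SHAP explainer in place of the closed form used here.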