A Framework for Interpretable Machine Learning in Hydrologic Thermal Modeling
Abstract
Understanding and modeling Reservoir Water Temperature (RWT) is critical for sustainable water management, ecosystem health, and climate resilience. However, conventional predictive models often act as black boxes, offering limited physical interpretability. This study presents a framework for interpretable machine learning in hydrologic thermal modeling that integrates explainable ML with symbolic representation to uncover the physical drivers of RWT dynamics. Using over 10,000 depth-resolved temperature profiles from ten reservoirs in the U.S. Red River Basin, we demonstrate how data-driven models can be transformed into transparent, physics-consistent surrogates. Ensemble and neural models, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Multilayer Perceptron (MLP), achieved strong predictive skill (best RMSE = 1.20 °C, R² = 0.97). SHAP (SHapley Additive exPlanations) analysis quantified the influence of key drivers such as air temperature, depth, wind, and lake volume, revealing consistent nonlinear dependencies. Building on these insights, Kolmogorov–Arnold Networks (KANs) were used to derive symbolic expressions that preserve interpretability while capturing nonlinearity. Across ten progressively complex KAN equations, R² rose from 0.84 with a single predictor to 0.92 with ten predictors, with only marginal gains beyond five variables, illustrating the trade-off between model parsimony and accuracy. The framework couples predictive performance with mechanistic understanding, demonstrating that interpretable ML and symbolic KAN formulations can advance hydrologic thermal modeling beyond prediction toward explanation, generalization, and process discovery.
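For readers less familiar with the two skill scores quoted above, the following is a minimal Python sketch of the standard definitions of RMSE and R². The function names and the four temperature values are illustrative assumptions, not taken from the study's data or code, and the abstract does not state how the authors computed these metrics.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (e.g. °C)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical water temperatures in °C (made up for illustration only).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

print(f"RMSE = {rmse(observed, predicted):.3f} °C")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

RMSE is expressed in the units of the target variable, so a value such as 1.20 °C can be read directly as a typical prediction error, while R² is dimensionless and measures the fraction of observed variance the model explains.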