A Framework for Interpretable Machine Learning in Hydrologic Thermal Modeling



Abstract

Understanding and modeling Reservoir Water Temperature (RWT) is critical for sustainable water management, ecosystem health, and climate resilience. However, conventional predictive models often act as black boxes, offering limited physical interpretability. This study presents a framework for interpretable machine learning in hydrologic thermal modeling that integrates explainable ML with symbolic representation to uncover the physical drivers of RWT dynamics. Using over 10,000 depth-resolved temperature profiles from ten reservoirs in the U.S. Red River Basin, we demonstrate how data-driven models can be transformed into transparent, physics-consistent surrogates. Ensemble and neural models, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), and a Multilayer Perceptron (MLP), achieved strong predictive skill (best RMSE = 1.20 °C, R² = 0.97). SHAP (SHapley Additive exPlanations) analysis quantified the influence of key drivers such as air temperature, depth, wind, and lake volume, revealing consistent nonlinear dependencies. Building on these insights, Kolmogorov–Arnold Networks (KANs) were used to derive symbolic expressions that preserve interpretability while capturing nonlinearity. Across ten progressively complex KAN equations, skill improved from R² = 0.84 with a single predictor to R² = 0.92 with ten predictors, with only marginal gains beyond five variables, illustrating a balance between model parsimony and accuracy. The framework couples predictive performance with mechanistic understanding, demonstrating that interpretable ML and symbolic KAN formulations can advance hydrologic thermal modeling beyond prediction toward explanation, generalization, and process discovery.
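
To make the attribution idea behind SHAP concrete, the following minimal sketch computes exact Shapley values for a three-feature toy function. Everything in it is an assumption for illustration: `toy_rwt`, its coefficients, the feature names, and the sample and baseline values are hypothetical and are not the study's models or data. It also uses a single baseline point and brute-force enumeration, whereas SHAP libraries typically average over a background dataset and use faster approximations.

```python
from itertools import combinations
from math import factorial


def toy_rwt(air_temp_c, depth_m, wind_ms):
    """Hypothetical stand-in for a trained RWT model (not the paper's)."""
    # The air-temperature signal weakens with depth; wind mixing slows that decay.
    decay = 0.1 * depth_m / (1.0 + 0.2 * wind_ms)
    return 4.0 + (0.8 * air_temp_c) / (1.0 + decay)


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to one baseline point."""
    names = list(x)
    n = len(names)

    def value(coalition):
        # Features in the coalition come from x; all others stay at the baseline.
        args = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(**args)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


# Illustrative inputs only (assumed values, not observations from the study).
x = {"air_temp_c": 28.0, "depth_m": 10.0, "wind_ms": 3.0}
baseline = {"air_temp_c": 15.0, "depth_m": 5.0, "wind_ms": 2.0}

phi = shapley_values(toy_rwt, x, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f} degC")
```

The contributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additivity property that lets SHAP decompose a prediction into per-driver terms. Enumeration cost grows exponentially with the number of features, so this form is practical only for a handful of predictors.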
