Hydrological modeling in a highly urbanized watershed using explainable machine learning and sub-hourly data: A case study in the city of Sao Paulo, Brazil
Abstract
Hydrological modeling of urbanized watersheds is challenging because the rainfall-runoff relationship in these areas is complex and non-linear. Many data-driven models have been proposed in the literature to address this problem. However, the field needs not only predictive performance but also explainability and an understanding of the impacts of hydrometeorological factors. This study presents a detailed comparative analysis of ensemble machine learning models within an explainable framework. We explore feature engineering and feature selection techniques to determine the best set of predictors when the data are non-continuous, a common problem in real-world scenarios. Among the models analysed, CatBoost was the best-performing algorithm in most cases, and, in general, all the ensemble algorithms achieved good performance for forecasting horizons of up to 3 hours. An analysis of SHAP values revealed insightful aspects of the spatial and temporal dynamics of the rainfall-runoff relationship.
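To make the non-continuous data issue concrete, the following is a minimal sketch of one way lagged predictors could be built from sub-hourly records that contain gaps. It is an illustration under stated assumptions, not the procedure used in the study: the 10-minute sampling step, the function name `build_lagged_samples`, the variable names, and the toy values are all hypothetical. A sample is kept only when every lagged timestamp and the target timestamp are actually present, so no window spans a gap.

```python
from datetime import datetime, timedelta

# Assumed sampling interval; the abstract only says the data are sub-hourly.
STEP = timedelta(minutes=10)


def build_lagged_samples(rain, level, n_lags, horizon):
    """Build (time, features, target) samples from gappy records.

    rain, level: dicts mapping datetime -> float (hypothetical inputs).
    n_lags: number of consecutive past steps used as predictors.
    horizon: forecast lead time, in steps.
    """
    samples = []
    for t in sorted(level):
        lag_times = [t - k * STEP for k in range(n_lags)]
        target_time = t + horizon * STEP
        # Skip the sample if the target observation is missing.
        if target_time not in level:
            continue
        # Skip the sample if any lagged observation falls inside a gap.
        if any(lt not in rain or lt not in level for lt in lag_times):
            continue
        features = [rain[lt] for lt in lag_times] + [level[lt] for lt in lag_times]
        samples.append((t, features, level[target_time]))
    return samples


# Toy records at steps 0..7 with step 4 missing (a gap in the series).
t0 = datetime(2020, 1, 1, 0, 0)
steps = [i for i in range(8) if i != 4]
rain = {t0 + i * STEP: float(i) for i in steps}
level = {t0 + i * STEP: 10.0 + i for i in steps}

samples = build_lagged_samples(rain, level, n_lags=2, horizon=1)
print(len(samples))  # 3 valid samples: steps 1, 2 and 6
```

Dropping incomplete windows avoids imputing values across a gap, at the cost of fewer training samples. The resulting feature lists could then be passed to any of the ensemble models being compared.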