Source identification of sudden water pollution events in the Dongliao River using a hybrid AI framework
Abstract
This study presents a novel hybrid framework for rapid and robust identification of sudden water pollution sources by integrating machine learning (ML) with numerical modeling, enabling high-precision inversion of source parameters while quantifying their uncertainties. A MIKE21 hydrodynamic-water quality model of the Dongliao River was developed to generate a synthetic dataset, which was used to train and evaluate long short-term memory (LSTM), kernel extreme learning machine (KELM), and support vector machine (SVM) surrogate models. Among them, the LSTM achieved the highest accuracy (R² = 0.98, RMSE = 0.03) and was selected for further integration. For deterministic source identification, a whale optimization algorithm (WOA)-LSTM model was developed, reducing the average inversion error to 6.89% (source location error < 3%) and computation time to 233 seconds. A probabilistic inversion system was subsequently established by coupling the WOA-LSTM model with a Bayesian framework, which characterized the posterior probability distributions of source parameters with an average error of 5.26%. To assess robustness, a comparative analysis under a 5% data noise scenario revealed that the probabilistic approach achieved an average relative error of 5.39%, representing a 47.2% improvement over the deterministic method’s 10.22% error. These findings demonstrate that integrating a physics-informed ML surrogate with Bayesian inference effectively addresses uncertainty and computational cost in environmental inverse problems, offering a powerful tool for intelligent early warning and precise management of sudden water pollution incidents.
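
The deterministic inversion step summarized above (an optimizer searching source parameters so that a fast surrogate reproduces the observed concentrations) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy analytical 1-D advection-dispersion solution in place of the trained LSTM surrogate, inverts only two parameters (released mass and source position), and uses hypothetical river constants, bounds, and function names (`forward`, `woa_minimize`, `identify_source`). The optimizer follows the commonly published whale optimization update rules with scalar coefficients per whale.

```python
import numpy as np

# Hypothetical river constants for the toy surrogate (not taken from the study).
U, D, AREA = 0.5, 50.0, 100.0               # velocity m/s, dispersion m^2/s, cross-section m^2
X_OBS = 10_000.0                            # monitoring station position, m
T_OBS = np.arange(600.0, 30_000.0, 300.0)   # sampling times since release, s
LOWER = np.array([100.0, 0.0])              # lower bounds: mass (kg), source position (m)
UPPER = np.array([2000.0, 8000.0])          # upper bounds


def forward(theta):
    """Toy stand-in for the surrogate: 1-D instantaneous-release solution at X_OBS."""
    mass, x0 = theta
    spread = 4.0 * D * T_OBS
    return mass / (AREA * np.sqrt(np.pi * spread)) * np.exp(
        -(X_OBS - x0 - U * T_OBS) ** 2 / spread)


def woa_minimize(cost, dim, n_whales=40, n_iter=300, b=1.0, seed=0):
    """Whale optimization algorithm over the unit cube [0, 1]^dim."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n_whales, dim))
    fit = np.array([cost(p) for p in pos])
    best, best_fit = pos[fit.argmin()].copy(), float(fit.min())
    for it in range(n_iter):
        a = 2.0 * (1.0 - it / n_iter)        # control parameter, decreases from 2 towards 0
        for i in range(n_whales):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: encircle the best whale; otherwise explore around a random whale.
                target = best if abs(A) < 1.0 else pos[rng.integers(n_whales)]
                new = target - A * np.abs(C * target - pos[i])
            else:
                # Spiral (bubble-net) move around the best whale.
                l = rng.uniform(-1.0, 1.0)
                new = np.abs(best - pos[i]) * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best
            pos[i] = np.clip(new, 0.0, 1.0)
            f = cost(pos[i])
            if f < best_fit:
                best, best_fit = pos[i].copy(), f
    return best, best_fit


def identify_source(observed, seed=0):
    """Search (mass, position) minimizing the RMSE between surrogate and observations."""
    def decode(z):
        return LOWER + z * (UPPER - LOWER)   # unit cube -> physical units

    def cost(z):
        return float(np.sqrt(np.mean((forward(decode(z)) - observed) ** 2)))

    z_best, misfit = woa_minimize(cost, dim=2, seed=seed)
    return decode(z_best), misfit


TRUE_THETA = np.array([800.0, 3000.0])       # hypothetical "true" source for the demo
estimate, misfit = identify_source(forward(TRUE_THETA))
```

Here `estimate` holds the recovered mass and position and `misfit` the final RMSE. Searching in normalized coordinates keeps the whale update steps comparable across parameters with very different units. In a setting like the one the abstract describes, `forward` would be replaced by the trained surrogate, and the same misfit could feed a likelihood for Bayesian sampling.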