Construction of a Stage-aware Water Quality Prediction Model Driven by the Temporal Evolution of Key Factors

Abstract

Conventional unified modeling strategies often overlook the stage-specific variability in pollution source contributions, which may lead to biased identification of key factors and diminished predictive performance. To address this limitation, this study develops a stage-aware modeling strategy based on stage-specific response mechanisms. Using the Daluxi River, a primary tributary of the upper Yangtze River, as a case study, the year was divided into two distinct periods by combining Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA): the low-temperature nutrient-retention period (November to February) and the multi-source disturbance period (March to October). Feature selection for each stage combined hierarchical clustering of Pearson correlation coefficients with Support Vector Regression with Recursive Feature Elimination (SVR-RFE), followed by model construction with Long Short-Term Memory (LSTM) networks. Furthermore, Generalized Additive Models (GAM) combined with SHAP (SHapley Additive exPlanations) were employed to elucidate the nonlinear response mechanisms of variables across stages. The results show that the R² values of CODMn, TN, and TP reached 0.962/0.928, 0.951/0.996, and 0.713/0.906 in the two stages, with notable reductions in MSE and MAPE, confirming the superiority of stage-specific over full-period modeling. For AN, although R² was relatively low (0.202/0.826), it exceeded the full-period result (0.518), with near-zero MSE and low MAPE (8.35%/3.71%). This strategy aligns with the stage-specific characteristics of pollution evolution, enhancing the scientific rigor and interpretability of the model, and provides a new approach for transitioning water environment management from "static averaging" to "dynamic identification and period-specific control."
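
The stage-wise evaluation summarized above (one R²/MSE/MAPE triple per period rather than one for the whole year) can be illustrated with a minimal sketch. It is not the authors' implementation: the month-to-stage mapping follows the periods named in the abstract, while the function names (`stage_of`, `metrics_by_stage`), the stage labels, and the `(month, observed, predicted)` record layout are assumptions introduced here for illustration. The predictions would come from whatever per-stage model is used (LSTM in the study); the sketch only covers the splitting and scoring step.

```python
from statistics import mean

# Months of the low-temperature nutrient-retention period (November to February),
# as stated in the abstract. All other months (March to October) are treated as
# the multi-source disturbance period.
LOW_TEMP_MONTHS = {11, 12, 1, 2}


def stage_of(month):
    """Map a calendar month (1-12) to a stage label (labels are hypothetical)."""
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    if month in LOW_TEMP_MONTHS:
        return "low_temp_retention"
    return "multi_source_disturbance"


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumes y_true contains at least two distinct values (SS_tot > 0).
    """
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no observed value is zero."""
    return 100.0 * mean([abs((t - p) / t) for t, p in zip(y_true, y_pred)])


def metrics_by_stage(records):
    """Group (month, observed, predicted) records by stage and score each group."""
    groups = {}
    for month, observed, predicted in records:
        y_true, y_pred = groups.setdefault(stage_of(month), ([], []))
        y_true.append(observed)
        y_pred.append(predicted)
    return {
        stage: {"r2": r2(t, p), "mse": mse(t, p), "mape": mape(t, p)}
        for stage, (t, p) in groups.items()
    }
```

Calling `metrics_by_stage` on a list of records returns one metrics dictionary per stage, which is the form in which paired values such as 0.962/0.928 are reported. Scoring the same records without grouping gives the full-period baseline against which the stage-specific results are compared.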
