Machine learning-based prediction of water quality indices to improve drinking water treatment operations: A case study in Ecuador
Abstract
Ensuring access to safe drinking water remains a pressing global challenge, particularly in regions subject to seasonal hydrological variability and anthropogenic pressures. Water Quality Indices (WQI) provide a synthetic metric that condenses multiple physicochemical parameters into a decision-oriented tool for treatment plant operations. This study applies, for the first time in Ecuador, a multi-horizon machine learning framework to predict the WQI of raw water at the intake of a drinking water treatment plant in Portoviejo, Manabí province. Using six daily-measured predictors (pH, turbidity, color, electrical conductivity, total dissolved solids, and total hardness), we implemented an M5P regression tree model with random hyperparameter search and conservative temporal cross-validation. The model achieved stable cross-validated performance across forecast horizons of 1 to 15 days, with R² ranging from 0.962 (1-day) to 0.926 (15-day) and MAE between 1.3 and 2.5 WQI units, sufficient to discriminate category transitions. Complementary GLM–ANOVA analysis revealed significant contributions from pH and color, followed by ionic load variables, aligning with operational treatment processes. The integration of explainable machine learning, statistical significance testing, and interpretable forecasting strengthens traceability and regulatory compliance under national and WHO guidelines. Results demonstrate that parsimonious models based on plant data can anticipate WQI dynamics with actionable lead time, supporting proactive chemical dosing, filter management, and resilience against extreme events. The proposed framework is transferable to other Latin American contexts with seasonal hydrology, bridging predictive analytics and operational decision-making in alignment with Sustainable Development Goal 6 (SDG 6).
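The multi-horizon setup and the temporal cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract gives no implementation details, so the function names (`make_horizon_pairs`, `expanding_splits`, `cv_mae`), the expanding-window split with a gap equal to the horizon, and the synthetic WQI series are all assumptions made for illustration. A one-variable least-squares line stands in for the M5P regression tree, which the Python standard library does not provide.

```python
import math
from statistics import mean


def make_horizon_pairs(series, horizon):
    """Pair each day's value with the value observed `horizon` days later."""
    return [(series[t], series[t + horizon]) for t in range(len(series) - horizon)]


def expanding_splits(n_samples, n_folds, gap):
    """Yield (train_idx, test_idx) where training always precedes testing.

    `gap` samples are skipped between train and test so that training
    targets do not reach into the test period (an assumed reading of
    "conservative" temporal cross-validation).
    """
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        test_start = train_end + gap
        test_end = min(test_start + fold_size, n_samples)
        if test_start < test_end:
            yield list(range(train_end)), list(range(test_start, test_end))


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in for M5P)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def cv_mae(series, horizon, n_folds=3):
    """Mean absolute error over all temporal test folds for one horizon."""
    pairs = make_horizon_pairs(series, horizon)
    errors = []
    for train_idx, test_idx in expanding_splits(len(pairs), n_folds, gap=horizon):
        a, b = fit_line([pairs[i][0] for i in train_idx],
                        [pairs[i][1] for i in train_idx])
        errors.extend(abs(pairs[i][1] - (a + b * pairs[i][0])) for i in test_idx)
    return mean(errors)


if __name__ == "__main__":
    # Synthetic, purely illustrative daily WQI series with a seasonal cycle.
    wqi = [70 + 10 * math.sin(2 * math.pi * t / 30) for t in range(180)]
    for h in (1, 7, 15):
        print(f"horizon={h:2d} days  MAE={cv_mae(wqi, h):.2f}")
```

One model is fitted per horizon, and each fold is scored only on data that lies strictly after its training window, which is the property that keeps the reported error from being inflated by temporal leakage.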