K-Nearest Neighbors Model to Optimize Data Classification According to the Water Quality Index of the Upper Basin of the City of Huarmey
Abstract
Water quality in Peru is an increasing concern, particularly in the upper Huarmey watershed, which is affected by heavy metal contamination and untreated wastewater. This study proposes an automated approach to classifying water quality according to Peru's Water Quality Index (WQI), using three supervised machine learning algorithms: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest (RF). The experimental results show that KNN outperforms the other two methods, reaching an accuracy of 95.2%. The proposed system automates classification and improves accuracy compared with manual methods based on Microsoft Excel. The methodology, performance metrics, dataset characteristics, and geographical context are detailed to ensure replicability. The resulting system assists decision-makers in environmental monitoring and public health protection.
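As a rough illustration of the kind of classifier the abstract refers to, the sketch below implements a minimal K-Nearest Neighbors vote in plain Python. It is not the authors' implementation: the two-feature samples, the labels `acceptable` and `poor`, the choice of `k=3`, and the helper names `knn_predict` and `accuracy` are all hypothetical, chosen only to show how a majority vote over nearest neighbors and an accuracy score might be computed.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Euclidean distance from the query to every training sample
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    distances.sort(key=lambda pair: pair[0])
    # Count the labels of the k closest samples and return the most frequent one
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


# Hypothetical, already-normalized feature pairs (NOT data from the study)
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
           (0.80, 0.90), (0.90, 0.80), (0.85, 0.95)]
train_y = ["acceptable", "acceptable", "acceptable", "poor", "poor", "poor"]

test_X = [(0.12, 0.18), (0.88, 0.85)]
test_y = ["acceptable", "poor"]

score = accuracy(train_X, train_y, test_X, test_y, k=3)
```

In practice the features would be the measured water quality parameters, scaled to comparable ranges so that no single parameter dominates the distance, and `k` would be tuned on held-out data rather than fixed in advance.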