K-Nearest Neighbors Model to Optimize Data Classification According to the Water Quality Index of the Upper Basin of the City of Huarmey

Abstract

Water quality in Peru is an increasing concern, particularly in the upper Huarmey watershed, which is affected by heavy metal contamination and untreated wastewater. This study proposes an automated classification approach using three supervised machine learning algorithms, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest (RF), to assess water quality based on the Water Quality Index (WQI) of Peru. The experimental results show that KNN outperforms the other two methods, reaching an accuracy of 95.2%. The proposed system automates classification and improves its accuracy compared with manual methods based on Microsoft Excel. The methodology, performance metrics, dataset characteristics, and geographical context are detailed to ensure replicability. The resulting classifier is intended to assist decision-makers with environmental monitoring and public health protection.
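
To make the KNN idea named in the abstract concrete, the following is a minimal sketch of majority-vote classification by nearest neighbors, written in plain Python. It is not the authors' implementation: the function name `knn_predict`, the three features (pH, dissolved oxygen, lead concentration), the class labels, and all sample values are hypothetical and chosen only for illustration. The abstract does not state which parameters, value of k, or preprocessing the study used.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Keep the k closest samples and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, made-up samples (NOT data from the study):
# (pH, dissolved oxygen in mg/L, lead in mg/L)
train_X = [
    (7.2, 8.1, 0.005),
    (7.0, 7.8, 0.008),
    (7.4, 8.4, 0.004),
    (5.9, 4.2, 0.060),
    (6.1, 3.9, 0.075),
    (5.7, 4.5, 0.090),
]
# Hypothetical class labels standing in for WQI categories.
train_y = ["good", "good", "good", "poor", "poor", "poor"]

prediction = knn_predict(train_X, train_y, (7.1, 8.0, 0.006), k=3)
print(prediction)
```

In this sketch the features are used unscaled, so the lead concentration contributes almost nothing to the distance. A real pipeline would normally standardize each feature before computing distances and would choose k by cross-validation.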
