K-Nearest Neighbors Model to Optimize Data Classification According to the Water Quality Index of the Upper Basin of the City of Huarmey

Hugo Vega-Huerta
Jean Pajuelo-Leon
Percy De-la-Cruz-VdV
David Calderón
Gisella Luisa Elena Maquen-Niño
Milton E. Rios-Castillo
Adegundo Camara-Figueroa
Rubén Gil-Calvo
Luis Guerra-Grados
Oscar Benito-Pacheco

Abstract

Water quality in Peru is a growing concern, particularly in the upper Huarmey watershed, which is affected by heavy metal contamination and untreated wastewater. This study proposes an automated classification approach using three supervised machine learning algorithms, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest (RF), to assess water quality according to Peru's Water Quality Index (WQI). In the experiments, KNN outperforms the other two methods, reaching an accuracy of 95.2%. The proposed system automates classification and improves its accuracy compared with manual methods based on Microsoft Excel. The methodology, performance metrics, dataset characteristics, and geographical context are detailed to support replicability. The resulting model can assist decision-makers with environmental monitoring and public health protection.
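
To make the KNN idea in the abstract concrete, here is a minimal sketch of majority-vote nearest-neighbor classification in plain Python. It is not the authors' pipeline: the two features (pH and dissolved oxygen), the sample values, the labels "good" and "poor", and the helper names `knn_predict` and `accuracy` are all hypothetical and chosen only for illustration. A real application would likely use the full set of WQI parameters and scale the features before computing distances.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training sample
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Sort by distance only, so labels are never compared with each other
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y))
    return hits / len(test_y)


# Made-up samples for illustration only: (pH, dissolved oxygen in mg/L)
train_X = [(7.2, 8.1), (7.0, 7.8), (7.4, 8.4), (5.1, 3.2), (4.8, 2.9), (5.4, 3.6)]
train_y = ["good", "good", "good", "poor", "poor", "poor"]
test_X = [(7.1, 8.0), (5.0, 3.0)]
test_y = ["good", "poor"]

print(knn_predict(train_X, train_y, (7.1, 8.0)))
print(accuracy(train_X, train_y, test_X, test_y))
```

With an odd `k` and two classes, the vote cannot tie; with more classes, a tie-breaking rule (for example, preferring the closest neighbor) would need to be chosen explicitly.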

Version published to 10.3390/app151810202
Sep 19, 2025
Version published to 10.20944/preprints202505.0112.v1
May 5, 2025

Related articles

A novel multivariate approach for water quality index prediction irrespective of the geospatial distinction of water sources having varied end uses

This article has 6 authors:
1. Himanchal Bhardwaj
2. Rajkumar Satankar
3. Deeptha Giridharan
4. Deepika Bhattu
5. Venkata Ravibabu Mandla
6. Anand Krishnan Plappally
This article has no evaluations. Latest version Jan 14, 2026

Prediction of Air Pollutants based on Time-Weighted Ensemble Model and Adaptive Air Quality Index Model

This article has 5 authors:
1. Borui Wang
2. Chenjie Gong
3. Chao Liu
4. Jiahe Yang
5. Huili Huang
This article has no evaluations. Latest version Jan 16, 2026

Comparative evaluation of the performance of nine Machine Learning models for predicting corn yield based on uncalibrated empirical data in Cameroon

This article has 4 authors:
1. Sopkoutie Kengni Nerlus Gautier
2. Bidias Aboh Francis
3. Deffo Tchouan Gilchrist
4. Kamseu Mogo Jean Paul
This article has no evaluations. Latest version Jan 19, 2026
