Improving K-Means Clustering: A Comparative Study of Parallelized Version of Modified K-Means Algorithm for Clustering of Satellite Images

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Efficient clustering of high-dimensional satellite image datasets remains a critical challenge, particularly due to the computational demands of spectral distance calculations, random centroid initialization, and sensitivity to outliers in conventional K-Means algorithms. This study presents a comprehensive comparative analysis of eight parallelized variants of the K-Means algorithm, designed to enhance clustering efficiency and reduce computational burden for large-scale satellite image analysis. The proposed parallelized implementations incorporate optimized centroid initialization for better starting point selection using a dynamic K-Means sharp method to detect the outlier to improve cluster robustness, and a Nearest-Neighbor Iteration Calculation Reduction method to minimize redundant computations. These enhancements were applied to a test set of 114 global land cover data cubes, each comprising high-dimensional satellite images of size 3712*3712*16, and executed on multi-core CPU architecture to leverage extensive parallel processing capabilities. Performance was evaluated across three criteria: convergence speed (iterations), computational efficiency (execution time), and clustering accuracy (RMSE). The Parallelized Enhanced K-Means (PEKM) method achieved the fastest convergence at 234 iterations and the lowest execution time of 4230 hours, while maintaining consistent RMSE values (0.0136) across all algorithm variants. These results demonstrate that targeted algorithmic optimizations, combined with effective parallelization strategies, can improve the practicality of K-Means clustering for high dimensional satellites image analysis. This work underscores the potential of improving K-Means clustering frameworks beyond hardware acceleration alone, offering scalable solutions good for large-scale unsupervised image classification tasks.

Article activity feed