A Data-Driven Machine Learning Clustering of Rainfall Patterns: Is Reclassification of Tanzanian Rainfall Climate Zones Needed?
Abstract
The wider accessibility of rainfall data and data-driven analytical approaches has boosted climate regionalization research worldwide, particularly in Tanzania. However, despite substantial research on rainfall trends and machine learning applications in Tanzania, we found no study that has systematically analyzed whether the country's traditional rainfall climatic zones should be reclassified using data-driven approaches. This study fills that gap by investigating long-term rainfall variability and determining whether the present climatic zoning is still valid under observed rainfall patterns. The study used a quantitative and comparative research design with 40 years of monthly rainfall data (1980-2020), mostly from the GPCC and CHIRPS satellite datasets. Three machine learning clustering algorithms were used to reclassify zones: k-means, hierarchical clustering, and Partitioning Around Medoids (PAM). The study also performed cluster validation and zone agreement analysis (data-driven vs. traditional zones) using silhouette scores, chi-square tests, Cramér's V, Cohen's Kappa, and the Adjusted Rand Index (ARI). The results show significant interannual rainfall variability among zones, but no statistically significant long-term trends. Agreement analysis, on the other hand, shows that traditional zoning is robust (it agrees with the data-driven clustering), despite small divergence within some transitional zones. The divergence indicates that some zones are internally heterogeneous and that only intra-zonal reclassification is needed. The study offers an analytical framework for improving climate zoning in Tanzania and enhances geographic precision in agricultural planning, water resource management, and climate adaptation measures.
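To make the agreement analysis more concrete, the following minimal sketch computes two of the measures named in the abstract, Cramér's V and the Adjusted Rand Index, for a pair of labelings. It is an illustration under stated assumptions, not the authors' implementation: the function names and the nine station labels are hypothetical, and the values are not taken from the study.

```python
from collections import Counter
from math import comb, sqrt


def contingency(a, b):
    """Count co-occurrences of (traditional zone, cluster label) pairs."""
    return Counter(zip(a, b))


def cramers_v(a, b):
    """Cramér's V computed from the chi-square statistic of the contingency table."""
    n = len(a)
    table = contingency(a, b)
    rows, cols = Counter(a), Counter(b)
    chi2 = 0.0
    for r, r_total in rows.items():
        for c, c_total in cols.items():
            expected = r_total * c_total / n
            observed = table.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


def adjusted_rand_index(a, b):
    """Adjusted Rand Index: pair-counting agreement corrected for chance."""
    table = contingency(a, b)
    sum_cells = sum(comb(v, 2) for v in table.values())
    sum_rows = sum(comb(v, 2) for v in Counter(a).values())
    sum_cols = sum(comb(v, 2) for v in Counter(b).values())
    expected = sum_rows * sum_cols / comb(len(a), 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case: a single group in both labelings
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical example: nine stations, three traditional zones (not study data).
traditional = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
# Hypothetical cluster labels in which one zone-A station joins the zone-B cluster.
clustered = [0, 0, 1, 1, 1, 1, 2, 2, 2]

print("Cramér's V:", round(cramers_v(traditional, clustered), 3))
print("ARI:", round(adjusted_rand_index(traditional, clustered), 3))
```

Both measures are invariant to how cluster labels are numbered, which suits a comparison between named traditional zones and arbitrary cluster IDs. Cohen's Kappa, by contrast, requires the cluster labels to be matched to zone names first.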