Explainable Machine Learning for Climate Change Hotspot Identification: Spatial Generalization Testing Across Sindh Province, Pakistan

Abstract

Machine learning (ML) is widely used in climate-change research, yet many analyses overestimate model performance because they do not account for spatial dependence. This methodological gap creates the misleading impression that models generalize well to new locations, when in fact they often fail outside the training domain. We evaluate eight ML models for predicting temperature anomalies across 22 districts of Sindh, Pakistan, using 44 years of observations. We assess spatial generalization with Leave-One-District-Out Cross-Validation (LODO-CV), in which each model is tested on a district withheld entirely from training. Gradient Boosting performed best (R² = 0.914 ± 0.098) when predicting temperature anomalies in districts excluded from training, indicating strong transferability across the region's diverse climatic zones. SHAP feature attribution showed that climate variables (37.6%), temporal trends (32.0%), and anthropogenic proxies (23.7%) are the most important predictors; the importance of the proxies, however, indicates correlation rather than causation and must be interpreted with care in policy applications. Partial dependence analysis estimated a negative vegetation-temperature relationship of −0.15 °C per 0.1 increase in NDVI, suggesting that vegetation preservation and restoration may provide cooling benefits. Using a dual-index model that integrates the frequency of extreme events with changes in average climate, we identified seven climate change hotspots, concentrated in the Karachi and Hyderabad urban areas, which face compound risks from urbanization, coastal exposure, and rising temperature extremes. The results underscore the urgent need for spatially explicit validation procedures in climate ML and offer practical guidance for targeted adaptation planning in Pakistan's most climate-vulnerable districts.
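
The leave-one-district-out idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`lodo_splits`, `lodo_cv`, `fit_line`), the toy data, and the district labels are all hypothetical, and a one-variable least-squares line stands in for Gradient Boosting so that the example needs only the Python standard library. It shows only the validation logic: every district is held out once, the model is fitted on the remaining districts, and R² is computed on the held-out district.

```python
from statistics import mean, pstdev


def lodo_splits(districts):
    """Yield (held_out, train_idx, test_idx), holding out one district at a time."""
    for held_out in sorted(set(districts)):
        train_idx = [i for i, d in enumerate(districts) if d != held_out]
        test_idx = [i for i, d in enumerate(districts) if d == held_out]
        yield held_out, train_idx, test_idx


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (stand-in for the real model)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def lodo_cv(x, y, districts):
    """Return {district: R^2 on that district when it is excluded from training}."""
    scores = {}
    for held_out, train_idx, test_idx in lodo_splits(districts):
        a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [a + b * x[i] for i in test_idx]
        scores[held_out] = r2_score([y[i] for i in test_idx], preds)
    return scores


# Hypothetical toy data: a year index and a temperature anomaly for three districts.
districts = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
x = [0, 1, 2, 3] * 3
y = [0.1, 0.3, 0.6, 0.8, 0.0, 0.2, 0.5, 0.9, 0.2, 0.4, 0.5, 0.7]

scores = lodo_cv(x, y, districts)
values = list(scores.values())
# Summarize as mean and spread across held-out districts.
print(f"mean R^2 = {mean(values):.3f}, std = {pstdev(values):.3f}")
```

Because each score comes from a district the model never saw, the spread across districts gives a rough picture of how uneven spatial transferability is, which a random train/test split would hide.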
