Machine Learning-Based Identification of Key Predictors for Lightning Events in the Third Pole Region
Abstract
The Third Pole region, particularly the Hindu-Kush-Himalaya (HKH), is highly prone to lightning, which causes thousands of fatalities annually. Skillful prediction and timely communication are essential for mitigating lightning-related losses in such data-sparse regions. This study therefore evaluates kilometer-scale ICON-CLM-simulated atmospheric variables as inputs to six machine learning (ML) models for detecting lightning activity over the Third Pole. Results from the ensemble boosting models show that a set of 25 ICON-CLM-simulated variables, including relative humidity (RH), vorticity (vor), 2 m temperature (t_2m), and surface pressure (sfc_pres), allows better spatial and temporal prediction of lightning activity, achieving a Probability of Detection (POD) of ∼0.65. The Lightning Potential Index (LPI) and the product of convective available potential energy (CAPE) and precipitation (prec_con), referred to as CP (i.e., CP = CAPE × precipitation), serve as key physics-aware predictors, maintaining a high POD of ∼0.62 with a 1–2 hour lead time. Sensitivity analyses that additionally use climatological lightning data show that, while the ML models maintain comparable accuracy and POD, climatology primarily supports broad spatial patterns rather than fine-scale prediction improvements. Because LPI and CP reflect cloud microphysics and atmospheric stability, their inclusion, along with spatiotemporal averaging and climatology, yields predictive skill slightly lower than, yet comparable to, that achieved with all 25 atmospheric predictors. Finally, model evaluation using the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) highlights XGBoost as the best-performing diagnostic classification (yes/no lightning) model across all six tested ML configurations.
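As an illustration of two quantities named in the abstract, the following minimal Python sketch computes the CP predictor (CP = CAPE × precipitation) and the POD of a yes/no lightning classifier. It is not the study's pipeline: the function names, the toy input values, and the CP threshold used to produce a yes/no prediction are all hypothetical, and POD is taken in its standard contingency-table form, hits / (hits + misses).

```python
from typing import List, Sequence


def cp_predictor(cape: Sequence[float], prec: Sequence[float]) -> List[float]:
    """Element-wise product CP = CAPE x precipitation for paired samples."""
    if len(cape) != len(prec):
        raise ValueError("cape and prec must have the same length")
    return [c * p for c, p in zip(cape, prec)]


def probability_of_detection(observed: Sequence[bool],
                             predicted: Sequence[bool]) -> float:
    """POD = hits / (hits + misses), counted over observed lightning events."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    hits = sum(1 for o, p in zip(observed, predicted) if o and p)
    misses = sum(1 for o, p in zip(observed, predicted) if o and not p)
    if hits + misses == 0:
        return float("nan")  # POD is undefined when no events were observed
    return hits / (hits + misses)


# Toy values for illustration only (not data from the study).
cape = [1200.0, 300.0, 2000.0, 50.0]   # CAPE samples
prec = [2.0, 0.0, 1.5, 0.5]            # precipitation samples
cp = cp_predictor(cape, prec)

CP_THRESHOLD = 1000.0                  # hypothetical cutoff, an assumption
predicted = [value > CP_THRESHOLD for value in cp]
observed = [True, True, True, False]   # hypothetical lightning observations

pod = probability_of_detection(observed, predicted)
print(f"CP = {cp}, POD = {pod:.2f}")
```

In this toy case two of the three observed events are flagged, so the POD is 2/3. POD ignores false alarms by construction, so it says nothing about how often a model predicts lightning that does not occur.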