Remotely Sensed Precipitation Estimates Using Hybrid Machine Learning Models in a Monsoon-Dominated Climate

Abstract

Precise precipitation estimation is vital for effective water resource management, disaster planning, and climate adaptation, particularly in data-deficient, monsoon-dominated subtropical countries such as Bangladesh. However, conventional approaches that rely on rain gauge networks or meteorological models face challenges such as sparse spatial coverage and high installation and maintenance costs. To fill this gap, this study proposes a hybrid machine learning model that estimates rainfall from remote sensing (RS) datasets (CHIRPS, PERSIANN, and ERA5) and applies a gradient-boosting-based bias correction. The model combines XGBoost, linear regression (LR), and K-Nearest Neighbors (KNN) in a stacked ensemble. The bias correction used monthly rainfall data from nine locations in Bangladesh (1990–2019) to correct common RS issues such as seasonal shifts and underestimation of peak rainfall. The meta-model outperformed the individual ML models (LR, RF, XGBoost, KNN), with R² values consistently above 0.75. Combining all three RS datasets improved performance (R² = 0.9) compared with using two (R² > 0.87) or one (R² > 0.8). The bias correction substantially enhanced predictive accuracy across geographical locations: after correction, R² increased by 6.5–13.1%, RMSE decreased by 22–60.1%, and MAE decreased by 49.6–71.4%. It also reduced pre-monsoon and monsoon inaccuracies, increasing robustness. The bias-corrected model achieved a median R² of 0.95, RMSE of 25 mm, and MAE of 15 mm. Overall, the hybrid meta-model outperformed all individual ML models in predicting rainfall from RS datasets, and bias correction significantly enhanced performance across contexts. This study is the first to compare various ML models with the proposed meta-model while integrating multiple RS datasets to improve accuracy.
The model’s limited ability to capture localized precipitation in complex terrains and short timeframes highlights the need for incorporating more climatic variables and advanced neural networks to improve accuracy and scalability.
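
The abstract reports bias-correction gains as relative changes in R², RMSE, and MAE. The sketch below shows one way such figures could be computed. It assumes R² is the coefficient of determination (the abstract does not say which definition was used), and the rainfall values are made-up illustrative numbers, not data from the study. The function names are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root mean squared error, in the same units as the inputs (mm)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error, in the same units as the inputs (mm)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def r2(obs, pred):
    """Coefficient of determination (assumed definition of R^2)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def pct_change(before, after):
    """Relative change in percent; negative means the metric decreased."""
    return 100.0 * (after - before) / before

# Made-up monthly rainfall totals in mm (illustrative only, not study data).
gauge = [10.0, 50.0, 200.0, 400.0, 300.0, 40.0]      # rain gauge observations
raw = [20.0, 70.0, 150.0, 300.0, 250.0, 60.0]        # uncorrected estimate
corrected = [12.0, 55.0, 190.0, 380.0, 290.0, 45.0]  # bias-corrected estimate

for name, fn in [("R2", r2), ("RMSE", rmse), ("MAE", mae)]:
    before, after = fn(gauge, raw), fn(gauge, corrected)
    print(f"{name}: {before:.3f} -> {after:.3f} ({pct_change(before, after):+.1f}%)")
```

Under this convention, a successful correction gives a positive percentage for R² and negative percentages for RMSE and MAE, which matches the direction of the changes reported in the abstract.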
