Temporal Transferability of Satellite Rainfall Bias Correction Methods in a Data-Limited Tropical Basin
Abstract
The Philippines experiences intense rainfall but has limited ground-based monitoring infrastructure for flood prediction. Satellite rainfall products provide broad coverage but contain systematic biases that reduce their operational usefulness. This study evaluated whether three bias correction methods, Quantile Mapping (QM), Random Forest (RF), and a Hybrid Ensemble, maintain accuracy when applied to a later period with substantially different rainfall characteristics. Using the Cagayan de Oro River Basin in Northern Mindanao as a case study, models were trained on 2019–2020 data and tested on an independent 2021 period exhibiting 120% higher mean rainfall and a 33% higher rainy-day frequency. During training, Random Forest and Hybrid Ensemble substantially outperformed Quantile Mapping (R² = 0.71 and 0.76, respectively, versus R² = 0.25 for QM). However, when tested under realistic operational constraints using seasonally incomplete calibration data (January–April only), the performance rankings reversed. Quantile Mapping maintained operational reliability (R² = 0.53, RMSE = 5.23 mm), while Random Forest and Hybrid Ensemble failed dramatically (R² dropping to 0.46 and 0.41, respectively). This demonstrates that training accuracy is a poor predictor of operational reliability under changing rainfall regimes. Because Quantile Mapping corrects by percentile rather than by absolute magnitude, it adapts when rainfall patterns shift without requiring recalibration, whereas the machine learning methods learned magnitude-specific patterns that failed when conditions changed. For flood early warning in data-limited basins subject to equipment failures and variable rainfall, only Quantile Mapping proved operationally reliable. These findings have practical implications for disaster risk reduction across the Philippines and similar tropical regions, where standard validation approaches may systematically mislead model selection by measuring calibration performance rather than operational transferability.
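The abstract does not give the study's implementation of Quantile Mapping, so the following is only a minimal sketch of generic empirical quantile mapping, meant to illustrate what a percentile-based correction looks like. The function names `fit_quantile_map` and `apply_quantile_map`, the choice of 101 quantiles, and the clamping of out-of-range values are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def fit_quantile_map(sat_train, gauge_train, n_quantiles=101):
    """Build paired quantile tables from calibration data.

    Hypothetical helper: sat_train and gauge_train are 1-D arrays of
    satellite and gauge rainfall (mm) over the same calibration period.
    """
    probs = np.linspace(0.0, 1.0, n_quantiles)
    sat_q = np.quantile(sat_train, probs)      # satellite value at each percentile
    gauge_q = np.quantile(gauge_train, probs)  # gauge value at the same percentile
    return sat_q, gauge_q


def apply_quantile_map(sat_new, sat_q, gauge_q):
    """Correct new satellite values by matching percentiles.

    Each satellite value is located within the satellite quantile table
    and replaced by the gauge value at the same percentile, using linear
    interpolation between tabulated quantiles.
    """
    # Many dry days produce tied quantiles; keep the first entry of each
    # tie so the interpolation grid is strictly increasing.
    sat_u, idx = np.unique(sat_q, return_index=True)
    gauge_u = gauge_q[idx]
    # Assumption: values outside the calibration range are clamped to the
    # end points of the table (np.interp's default behavior).
    return np.interp(np.asarray(sat_new, dtype=float), sat_u, gauge_u)
```

In this sketch, the tables would be fitted on the calibration period and then applied unchanged to the later test period. Real applications often treat dry days and extremes beyond the calibration range separately; the clamping used here is a simplification.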