A Machine Learning-Based Quality Control Algorithm for Heavy Rainfall Using Multi-Source Data

Hao Sun
Qing Zhou
Lijuan Shi
Cuina Li
Shiguang Qin
Dan Yao
Mingyi Xu
Yang Huang
Qin Hu
Yunong Guan

Abstract

In this study, a machine learning-based quality control algorithm for heavy rainfall was developed by integrating automatic weather station observations with remote sensing data, minute-level data, and metadata. Based on heavy rainfall samples from 1 June 2022 to 31 December 2024, four gradient boosting models (XGBoost, LightGBM, CatBoost, and GBRT) significantly outperformed the conventional method, with XGBoost in particular increasing precision by 0.110, recall by 0.162, and F1-score by 0.140. This performance gain is attributed to the models’ ability to learn nonlinear features from complex multi-source data, thereby reducing both false alarms and missed detections of anomalous rainfall events. Radar composite reflectivity, satellite cloud-top temperature, and minute-level precipitation were identified as the dominant contributors to model predictions. The integration of multi-sensor observations addressed limitations inherent in conventional threshold-based approaches. SHAP-based interpretability analysis showed that the model’s decision logic aligns with meteorological physical principles. Characteristic patterns, such as combinations of low radar reflectivity and elevated cloud-top temperatures, were flagged as anomalous rainfall events, typically corresponding to manual operational errors. Moreover, the model identified anomalous minute-level precipitation extremes as critical signals for detecting instrument malfunctions and data encoding and transmission errors. The physical consistency of the model’s reasoning enhances its trustworthiness and supports its potential for operational implementation in heavy rainfall quality control.
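The three scores quoted in the abstract follow directly from counts of correct flags, false alarms, and missed detections. The sketch below is only an illustration of those standard definitions, not the authors' evaluation code: the function name `qc_scores` and the toy label lists are hypothetical, and the numbers are not results from the study.

```python
from typing import Sequence, Tuple


def qc_scores(truth: Sequence[int], flagged: Sequence[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for binary anomaly flags, where 1 = anomalous."""
    if len(truth) != len(flagged):
        raise ValueError("truth and flagged must have the same length")

    # Correctly flagged anomalous rainfall records.
    tp = sum(1 for t, f in zip(truth, flagged) if t == 1 and f == 1)
    # False alarms: valid records that were flagged as anomalous.
    fp = sum(1 for t, f in zip(truth, flagged) if t == 0 and f == 1)
    # Missed detections: anomalous records that were not flagged.
    fn = sum(1 for t, f in zip(truth, flagged) if t == 1 and f == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical toy labels (illustration only, not data from the study).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
flagged = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = qc_scores(truth, flagged)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Fewer false alarms raise precision and fewer missed detections raise recall, so a method that reduces both, as the abstract reports for the gradient boosting models, improves F1 as well.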

Version published to 10.20944/preprints202510.1276.v1
Oct 16, 2025

Related articles

Remotely Sensed Precipitation Estimates Using Hybrid Machine Learning Models in a Monsoon-Dominated Climate

This article has 4 authors:
1. Swaranjit Roy
2. Md. Helal Ahmmed
3. Susmith Kundu
4. Abu Reza Md. Towfiqul I
This article has no evaluations. Latest version: Sep 22, 2025

Flood Prediction with Artificial Intelligence An Exploratory Data Analysis Approach

This article has 7 authors:
1. Arya Vithal Mane
2. Rashmi Ravindra Halkarni
3. Pallavi Mahesh Bhat
4. Amarnath Mahesh Kakatikar
5. Rajkumar Raikar
6. Rajashri Khanai
7. Salma Shamashoddin Shahapur
This article has no evaluations. Latest version: Sep 17, 2025

Employing Artificial Intelligence to Predict δ¹⁸O and δ²H Isotope Ratios in Precipitation in Iraq under Changing Climate Patterns

This article has 6 authors:
1. Ali Al Maliki
2. Ali Al-Naji
3. Ahmed Al Lami
4. Haitham A. Afan
5. Maryam Bayatvarkeshi
6. Nadhir Al-Ansari
This article has no evaluations. Latest version: Oct 17, 2025
