A Methodological Comparison of Forecasting Models Using KZ Decomposition and Walk-Forward Validation
Abstract
Accurate forecasting of surface air temperature (T2M) is critical: it underpins effective early warning systems, water resource management, and climate science. A major difficulty for traditional models is that environmental drivers do not act uniformly; they operate on several time scales, from short-term fluctuations through seasonal cycles to long-term trends. In this study, we separate both the predictors and the target variable into short-term, seasonal, and long-term components using the Kolmogorov–Zurbenko (KZ) filter. Each component is modeled independently using three classical regression methods (linear regression, ridge regression, and lasso regression) and two machine learning algorithms (Random Forest and XGBoost). The predicted components are then recombined additively to form the final forecast. Although the KZ filter has been used extensively in air quality research, this work is the first to integrate it with both classical regression and advanced machine learning for T2M forecasting. Using walk-forward validation, we find that component-wise modeling consistently outperforms direct modeling of the raw series. XGBoost shows the largest gain, with R² increasing from 0.80 to 0.91 and RMSE decreasing from 0.44 to 0.29.
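The pipeline summarized in the abstract (KZ decomposition, per-component modeling, additive recombination, walk-forward evaluation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the window lengths (15 and 365), the iteration count (3), the synthetic data, the helper names (`kz_filter`, `kz_decompose`, `walk_forward_splits`, `fit_predict_ols`), and the use of plain least squares in place of the five models the study compares. The KZ filter itself is implemented as repeated passes of a centered moving average.

```python
import numpy as np

def kz_filter(x, window, iterations):
    """KZ filter: `iterations` passes of a centered moving average of odd length `window`."""
    y = np.asarray(x, dtype=float)
    if window % 2 == 0 or window > len(y):
        raise ValueError("window must be odd and no longer than the series")
    kernel = np.ones(window)
    # Number of samples inside the window at each position (fewer near the edges).
    counts = np.convolve(np.ones(len(y)), kernel, mode="same")
    for _ in range(iterations):
        y = np.convolve(y, kernel, mode="same") / counts
    return y

def kz_decompose(x, short_window=15, long_window=365, iterations=3):
    """Split x into short-term, seasonal and long-term parts that sum back to x.
    Window lengths are illustrative placeholders, not values from the paper."""
    x = np.asarray(x, dtype=float)
    baseline = kz_filter(x, short_window, iterations)   # seasonal + long-term
    long_term = kz_filter(x, long_window, iterations)
    return {"short": x - baseline, "seasonal": baseline - long_term, "long": long_term}

def walk_forward_splits(n_samples, initial_train, horizon):
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    start = initial_train
    while start + horizon <= n_samples:
        yield np.arange(0, start), np.arange(start, start + horizon)
        start += horizon

def fit_predict_ols(x_train, y_train, x_test):
    """Ordinary least squares with intercept (stand-in for any regressor)."""
    A = np.column_stack([np.ones(len(x_train)), x_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(x_test)), x_test]) @ coef

# Synthetic daily data: one predictor and a target that depends on it.
rng = np.random.default_rng(0)
n = 1200
t = np.arange(n)
predictor = 10 * np.sin(2 * np.pi * t / 365) + 0.002 * t + rng.normal(0, 1, n)
target = 0.5 * predictor + rng.normal(0, 0.3, n)

# NOTE: the centered filter looks at future samples; decomposing the full series
# before splitting, as done here for brevity, lets test-period data leak into training.
P = kz_decompose(predictor)
Y = kz_decompose(target)

folds = list(walk_forward_splits(n, initial_train=800, horizon=100))
forecast = np.zeros(n)
for train_idx, test_idx in folds:
    for name in ("short", "seasonal", "long"):
        # One model per component; predictions are summed (additive recombination).
        forecast[test_idx] += fit_predict_ols(
            P[name][train_idx], Y[name][train_idx], P[name][test_idx]
        )

evaluated = np.concatenate([test_idx for _, test_idx in folds])
rmse = float(np.sqrt(np.mean((forecast[evaluated] - target[evaluated]) ** 2)))
print(f"walk-forward RMSE of the recombined forecast: {rmse:.3f}")
```

Because the three components are defined as differences of filtered series, they sum back to the original series exactly, which is what makes additive recombination of the component forecasts well defined. In a real forecasting setting the decomposition would need to be computed using only data available at prediction time.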