Spatiotemporal Forecasting in Climate Data Using EOFs and Machine Learning Models: A Case Study in Chile
Abstract
Effective resource management and environmental planning in regions with high climatic variability, such as Chile, demand advanced predictive tools. Success in these areas relies heavily on accurately interpreting and forecasting climatic patterns. This study addresses these challenges with an innovative and computationally efficient hybrid methodology that integrates machine learning (ML) methods for time series forecasting with established statistical techniques. The spatiotemporal data are decomposed into time-dependent Empirical Orthogonal Functions (EOFs), denoted \(\phi_k(t)\), and their corresponding spatial coefficients, \(\alpha_k(s)\), to reduce dimensionality. Wavelet analysis provides high-resolution time and frequency information from the \(\phi_k(t)\) functions, while neural networks forecast these functions within a medium-range horizon \(h\). Using various ML models, particularly a Wavelet–ANN hybrid model, we forecast \(\phi_k(t+h)\) up to a time horizon \(h\) and subsequently reconstruct the spatiotemporal data from these extended EOFs.

The methodology is applied to a grid of climate data comprising 6355 points covering the entire territory of Chile. It reduces a high-dimensional multivariate spatiotemporal forecasting problem (involving 6355 time series) to a low-dimensional univariate time series forecasting problem (requiring only a few dozen forecasts). Additionally, cluster analysis using Dynamic Time Warping to define similarities between rainfall time series, together with spatial coherence and predictability assessments, has been instrumental in identifying geographic areas where model performance is enhanced. This analysis also clarifies the reasons behind poor forecast performance in regions or clusters with low spatial coherence and predictability. Using cluster medoids makes the forecasting process more practical and efficient.
This compound approach significantly reduces computational complexity while generating forecasts of reasonable accuracy and utility.

SIGNIFICANCE STATEMENT. The approach outlined in this study reduces a high-dimensional multivariate spatiotemporal data forecasting problem to a low-dimensional univariate time series forecasting problem. This reduction substantially lowers computational complexity while yielding reasonably accurate forecasts, and it enhances our ability to interpret and predict climatic patterns across the entire Chilean territory over medium-term horizons, despite the country's high climatic variability.
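The decompose, forecast, and reconstruct pipeline described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: the EOFs are computed with a plain SVD of the centered data matrix, the singular values are absorbed into the temporal functions \(\phi_k(t)\), a least-squares autoregressive model stands in for the Wavelet–ANN forecaster, and the data are synthetic. The function names `eof_decompose`, `ar_forecast`, and `forecast_field` are hypothetical.

```python
import numpy as np

def eof_decompose(X, n_modes):
    """Split X (time x space) into a mean field, temporal functions
    phi (time x K), and spatial coefficients alpha (K x space)."""
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    phi = U[:, :n_modes] * s[:n_modes]   # phi_k(t), scaled by singular values (assumed convention)
    alpha = Vt[:n_modes]                 # alpha_k(s)
    return mean, phi, alpha

def ar_forecast(series, order, horizon):
    """Iterated least-squares AR(order) forecast.
    Stand-in only: the study uses ML models such as a Wavelet-ANN hybrid."""
    rows = np.array([series[i:i + order] for i in range(len(series) - order)])
    design = np.column_stack([rows, np.ones(len(rows))])  # lagged values plus intercept
    coef, *_ = np.linalg.lstsq(design, series[order:], rcond=None)
    history = list(series[-order:])
    forecasts = []
    for _ in range(horizon):
        nxt = float(np.dot(coef[:-1], history[-order:]) + coef[-1])
        forecasts.append(nxt)
        history.append(nxt)  # feed each forecast back in as the newest lag
    return np.array(forecasts)

def forecast_field(X, n_modes=3, order=12, horizon=6):
    """Forecast each phi_k separately, then rebuild the full spatial field."""
    mean, phi, alpha = eof_decompose(X, n_modes)
    phi_future = np.column_stack(
        [ar_forecast(phi[:, k], order, horizon) for k in range(n_modes)]
    )
    return mean + phi_future @ alpha     # shape: horizon x space

# Synthetic example: 240 time steps at 50 grid points, two periodic modes plus noise.
rng = np.random.default_rng(0)
T, S = 240, 50
t = np.arange(T)
signals = np.stack([np.sin(2 * np.pi * t / 12), np.cos(2 * np.pi * t / 6)], axis=1)
patterns = rng.normal(size=(2, S))
X = signals @ patterns + 0.05 * rng.normal(size=(T, S))

forecast = forecast_field(X, n_modes=2, order=12, horizon=6)
print(forecast.shape)
```

In this sketch only `n_modes` univariate series are forecast, regardless of how many grid points the field contains, which is the source of the dimensionality reduction the abstract describes.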