Research on Deep Learning Financial Volatility Prediction Method Based on Signal Decomposition and Data Augmentation
Abstract
Accurate forecasting of financial volatility is critical for risk management and investment decision-making. This study proposes a novel three-stage hybrid model (C-WG-BL) integrating Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN), a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP), and Bidirectional Long Short-Term Memory (BiLSTM) networks. First, CEEMDAN decomposes the original volatility series into multi-scale Intrinsic Mode Functions (IMFs). Next, WGAN-GP is applied to the high-frequency IMFs, which contain most of the noise and microstructural fluctuations, to generate high-quality synthetic data that expands the training set and enhances the model’s ability to capture complex patterns. Finally, BiLSTM forecasts every IMF (the augmented high-frequency components and the original low-frequency components), and the component forecasts are combined to reconstruct the final prediction. Empirical analysis using 1-minute high-frequency data from the SSE 50, CSI 300, CSI 500, and SSE Composite Index shows that C-WG-BL significantly outperforms mainstream deep learning models and ablation variants. For the SSE 50, compared with the best baseline (C-BiLSTM), it improves R² by 5.38% and reduces MAPE from 19.197% to 7.432% (a reduction of 11.76 percentage points). The model also maintains high accuracy under extreme conditions such as the early COVID-19 outbreak, and Diebold–Mariano tests confirm that the error reductions are statistically significant. This study offers an efficient, robust, and generalizable solution for high-frequency financial volatility prediction.
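The evaluation criteria named in the abstract (MAPE, R², and the Diebold–Mariano test) can be illustrated with a minimal sketch. The abstract does not state the loss function or forecast horizon used for the Diebold–Mariano test, so squared-error loss and a one-step horizon are assumptions here, and the function names `mape`, `r2`, and `diebold_mariano` are hypothetical rather than taken from the authors' code. The data in the demo are synthetic and serve only to show the calls.

```python
import math
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def diebold_mariano(y_true, pred_a, pred_b, h=1):
    """Diebold-Mariano test under squared-error loss (an assumption).

    Returns (statistic, two-sided p-value from the normal approximation).
    A negative statistic means forecast A has the lower average loss.
    """
    y_true = np.asarray(y_true, dtype=float)
    loss_a = (y_true - np.asarray(pred_a, dtype=float)) ** 2
    loss_b = (y_true - np.asarray(pred_b, dtype=float)) ** 2
    d = loss_a - loss_b                      # loss differential series
    n = d.size
    d_bar = d.mean()
    # Long-run variance: autocovariances of d up to lag h - 1.
    lrv = np.sum((d - d_bar) ** 2) / n
    for k in range(1, h):
        lrv += 2.0 * np.sum((d[k:] - d_bar) * (d[:-k] - d_bar)) / n
    stat = d_bar / math.sqrt(lrv / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|stat|))
    return float(stat), float(p_value)


if __name__ == "__main__":
    # Synthetic stand-ins for realized volatility and two competing forecasts.
    rng = np.random.default_rng(0)
    y = np.abs(rng.normal(1.0, 0.2, size=500)) + 0.1
    forecast_a = y + rng.normal(0.0, 0.01, size=500)   # more accurate forecast
    forecast_b = y + rng.normal(0.0, 0.05, size=500)   # less accurate forecast
    print("MAPE A/B:", mape(y, forecast_a), mape(y, forecast_b))
    print("R2   A/B:", r2(y, forecast_a), r2(y, forecast_b))
    print("DM (stat, p):", diebold_mariano(y, forecast_a, forecast_b))
```

Under this sign convention, a significantly negative statistic for (proposed model, baseline) would correspond to the kind of error reduction the abstract reports; a multi-step horizon would be handled by passing `h` greater than 1, which adds autocovariance terms to the variance estimate.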