From LSTM to GPT-2: Recurrent and Transformer-Based Deep Learning Architectures for Multivariate High-Liquidity Cryptocurrency Price Forecasting
Abstract
This study presents a comprehensive comparative analysis of recurrent and transformer-based deep learning architectures for multivariate cryptocurrency price forecasting. Five high-liquidity digital assets—Bitcoin (BTC), Ethereum (ETH), Ripple (XRP), Stellar (XLM), and Solana (SOL)—are modeled using an extensive technical indicator set generated with the pandas_ta library, incorporating trend-, momentum-, volatility-, and volume-based features. The experimental framework evaluates six architectures—LSTM, GPT-2, Informer, Autoformer, Temporal Fusion Transformer (TFT), and the Vanilla Transformer—under a unified preprocessing pipeline comprising data cleansing, missing-value imputation, normalization, and sliding-window sequence generation. Models are trained and tested on identical temporal partitions with identical optimization strategies to ensure methodological consistency. Forecasting performance is assessed with multiple evaluation metrics, including MSE, MAE, RMSE, MAPE, and R². The results indicate that transformer-based architectures generally outperform recurrent models in capturing long-range dependencies and complex feature interactions, particularly in multivariate settings with rich technical indicator inputs. Informer and Autoformer exhibit strong stability over longer horizons, whereas GPT-2 achieves competitive short-term accuracy despite its computational demands. Observed challenges include normalization inconsistencies, hyperparameter sensitivity, and the significant training costs associated with large-scale transformer architectures. Overall, the findings highlight the potential of modern transformer-based approaches as robust and scalable alternatives for high-frequency cryptocurrency forecasting.
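As a minimal sketch of the kind of pipeline described above—assuming an OHLCV DataFrame as input, standard pandas_ta indicator calls, min-max scaling fitted on the training partition, and a one-step-ahead target—the following illustrates the feature generation, sliding-window construction, and metric computation. The specific indicators, window length, and horizon are illustrative choices, not the paper's exact configuration.

```python
import numpy as np
import pandas as pd
import pandas_ta as ta  # registers the .ta DataFrame accessor


def build_features(ohlcv: pd.DataFrame) -> pd.DataFrame:
    """Append example trend-, momentum-, volatility-, and volume-based indicators."""
    df = ohlcv.copy()
    df.ta.sma(length=20, append=True)     # trend: simple moving average
    df.ta.rsi(length=14, append=True)     # momentum: relative strength index
    df.ta.bbands(length=20, append=True)  # volatility: Bollinger Bands
    df.ta.obv(append=True)                # volume: on-balance volume
    return df.dropna()                    # drop warm-up rows left by the indicators


def minmax_fit_transform(train: np.ndarray, test: np.ndarray):
    """Min-max normalization fitted on the training split only, applied to both splits."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    scale = np.where(hi - lo == 0, 1.0, hi - lo)
    return (train - lo) / scale, (test - lo) / scale


def make_windows(values: np.ndarray, lookback: int = 60, horizon: int = 1):
    """Sliding-window sequences: X has shape (n, lookback, n_features), y is the
    close price (assumed to be column 0) `horizon` steps after each window."""
    X, y = [], []
    for i in range(len(values) - lookback - horizon + 1):
        X.append(values[i:i + lookback])
        y.append(values[i + lookback + horizon - 1, 0])
    return np.asarray(X), np.asarray(y)


def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """MSE, MAE, RMSE, MAPE, and R² for a single forecast series."""
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    return {
        "MSE": mse,
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(mse)),
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100),  # prices assumed nonzero
        "R2": float(1 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)),
    }
```

Fitting the scaler on the training partition only and reusing its statistics on the test partition is one way to avoid the normalization inconsistencies noted above, since it prevents test-set information from leaking into the inputs seen during training.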