From LSTM to GPT-2: Recurrent and Transformer-Based Deep Learning Architectures for Multivariate High-Liquidity Cryptocurrency Price Forecasting

Abstract

This study introduces a unified and methodologically symmetric comparative framework for multivariate cryptocurrency forecasting, addressing long-standing inconsistencies in prior research, where model families, feature sets, and preprocessing pipelines differ across studies. Under an identical and rigorously controlled experimental setup, we benchmark six deep learning architectures (LSTM, GPT-2, Informer, Autoformer, the Temporal Fusion Transformer (TFT), and a Vanilla Transformer) together with four widely used econometric models (ARIMA, VAR, GARCH, and a Random Walk baseline). All models are evaluated on a shared multivariate feature space of more than forty technical indicators, with identical normalization procedures, harmonized sliding-window configurations, and aligned temporal splits across five high-liquidity assets (BTC, ETH, XRP, XLM, and SOL). The experimental results show that transformer-based architectures consistently outperform both the recurrent baseline and the classical econometric models across all assets. This superiority arises from the ability of attention mechanisms to capture long-range temporal dependencies and to adaptively weight informative time steps, whereas recurrent models suffer from vanishing-gradient limitations and restricted effective memory. The best-performing deep learning models achieve MAPE values of 0.0289 (BTC, GPT-2), 0.0198 (ETH, Autoformer), 0.0418 (XRP, Informer), 0.0469 (XLM, Informer), and 0.0578 (SOL, TFT), substantially improving on both the LSTM and all econometric baselines. These findings highlight the effectiveness of attention-based architectures in modeling volatility-driven nonlinear dynamics and establish a reproducible, symmetry-preserving benchmark for future research in deep-learning-based financial forecasting.
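As a rough illustration of the harmonized pipeline the abstract describes, the sketch below builds sliding windows over a multivariate feature matrix, applies min-max normalization fitted on the training split only, and computes MAPE on a held-out temporal split. The window length (60), the 80/20 split ratio, the synthetic data, and the naive stand-in predictor are illustrative assumptions, not the paper's actual settings.

```python
# Minimal sketch of a shared sliding-window preprocessing and MAPE
# evaluation pipeline, as outlined in the abstract. All concrete values
# (window=60, 80/20 split, 40 synthetic indicators) are assumptions.
import numpy as np

def make_windows(features: np.ndarray, target: np.ndarray, window: int = 60):
    """Turn a (T, F) multivariate series into sliding windows.

    Returns X of shape (T - window, window, F) and y of shape (T - window,),
    where each window of past feature rows predicts the next target value.
    """
    X = np.stack([features[i : i + window] for i in range(len(features) - window)])
    y = target[window:]
    return X, y

def minmax_fit_transform(train: np.ndarray, test: np.ndarray):
    """Min-max normalize using statistics from the training split only,
    so no test-set information leaks into preprocessing."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (train - lo) / scale, (test - lo) / scale

def mape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute percentage error, reported as a fraction (e.g. 0.0289)."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))

# Toy example: 500 time steps, 40 technical-indicator columns.
rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(0, 1, 500)) + 100.0
indicators = rng.normal(0, 1, (500, 40))

split = int(0.8 * len(prices))                    # aligned temporal split
train_f, test_f = minmax_fit_transform(indicators[:split], indicators[split:])
X_train, y_train = make_windows(train_f, prices[:split])
X_test, y_test = make_windows(test_f, prices[split:])

naive_pred = np.full_like(y_test, y_train[-1])    # stand-in for a trained model
print(f"windows: {X_train.shape}, test MAPE: {mape(y_test, naive_pred):.4f}")
```

In an actual replication of the benchmark, the stand-in predictor would be replaced by each of the ten models, with the identical windows, normalization, and splits reused across all of them so that performance differences reflect the architectures rather than the preprocessing.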
