Benchmarking modeling architectures for cryptocurrency price prediction using financial and social media data

Abstract

The volatility of cryptocurrencies necessitates reliable short-term price prediction models for informed investment decisions. This work presents two benchmarking studies that predict cryptocurrency prices over hourly and daily time horizons using market indicators and social media data. Study 1 used BERT-based sentiment analysis of hourly Twitter data combined with financial indicators, while Study 2 applied VADER sentiment analysis to daily Twitter and Google Trends data alongside financial indicators. Both studies systematically evaluated statistical models (ARIMA, ARIMAX), machine learning approaches (SVR), and deep learning architectures (1D-CNN, LSTM), including ensemble, multi-modal, and hybrid configurations. Particular attention was given to the influence of lag periods, data aggregation, and sentiment analysis nuances on cryptocurrency price. Empirical results identify LSTM as the best-performing singular prediction model, achieving a 64.5% reduction in RMSE (4.56e-03) compared with the SVR baseline in Study 1. In Study 2, the hybrid LSTM + ARIMA model delivered the strongest performance, reducing RMSE by 32.5% (RMSE=2.55e+02) relative to the best-performing singular baseline. Hybrid architectures combining LSTM with ARIMA or ARIMAX consistently achieved the lowest RMSE values, outperforming all other configurations and proving especially effective at capturing price movements and turning points. These findings demonstrate how combining statistical methods with deep learning can address non-stationarity, improve sentiment preprocessing, and enhance model interpretability.
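
The abstract reports its results as percentage reductions in RMSE relative to a baseline. As a minimal sketch of how such a figure is typically computed, the Python below defines RMSE and a relative reduction. The function names and the toy price series are hypothetical illustrations, not the authors' code or data, and the reduction formula is assumed to be the usual (baseline - model) / baseline.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_reduction_pct(baseline_rmse, model_rmse):
    """Percentage by which model_rmse improves on baseline_rmse.

    Assumed definition: 100 * (baseline - model) / baseline.
    """
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Hypothetical toy series, made up for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
baseline_pred = [98.0, 104.0, 103.0, 101.0]  # stand-in for a baseline model
model_pred = [99.0, 103.0, 100.0, 104.0]     # stand-in for a candidate model

baseline_rmse = rmse(actual, baseline_pred)
model_rmse = rmse(actual, model_pred)
reduction = rmse_reduction_pct(baseline_rmse, model_rmse)
print(f"baseline RMSE={baseline_rmse:.3f}, model RMSE={model_rmse:.3f}, "
      f"reduction={reduction:.1f}%")
```

Because RMSE is expressed in the units of the target, values are only comparable within a single study; the very different magnitudes reported for Study 1 and Study 2 suggest the targets were scaled differently, though the abstract does not say so.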
