A Unified Framework for Stock Price Prediction: Integrating NLP-Based Sentiment, Dimensionality Reduction and Regularization

Erdem Korhan Akçay
İsmail Yenilmez

Abstract

This study examines the effect of news articles on stock price prediction and evaluates the role of dimensionality reduction and regularization techniques in improving forecasting performance. Four natural language processing (NLP) variables, Sentiment Score, Sentiment Polarity, VADER Compound, and Lexicon Score, were extracted from news texts and integrated with traditional time series indicators. Variable selection and dimensionality reduction were performed using Elastic Net, LASSO, PCA, PCA + Elastic Net, and PCA + LASSO methods. The constructed datasets, combining time series and NLP-based variables, were tested with ARIMAX, ANN, LSTM, and GRU models. The analyses, carried out through both simulation studies and applications on eight stock data series, revealed that incorporating NLP variables alongside technical indicators significantly enhances prediction accuracy. Furthermore, hybrid approaches such as PCA combined with Elastic Net or LASSO proved effective in reducing feature space complexity while preserving predictive power. Overall, the findings demonstrate that integrating dimensionality reduction, regularization techniques, and sentiment-based news analysis into traditional time series forecasting provides a comprehensive and robust framework for more accurate stock price prediction.

MSC Classification: 68T07, 68T50, 62M10, 62H25, 62J99
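
The regularization step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python LASSO fitted by cyclic coordinate descent on synthetic data. The feature names (lagged_return, vader_compound, noise_1, noise_2), the penalty value, and the data-generating coefficients are all assumptions chosen for illustration. The sketch shows how the L1 penalty drives the coefficients of uninformative variables to exactly zero, which is the variable selection behaviour that LASSO contributes to a pipeline of this kind.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1.

    X is a list of rows. No intercept is fitted, so the columns of X
    and the target y are assumed to be (roughly) centred.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant-zero column carries no information
            # Correlation of column j with the residual that excludes
            # column j's own current contribution.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j])
                      for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic, illustrative data (NOT taken from the study): two
# informative columns and two pure-noise columns. Names are hypothetical.
random.seed(0)
feature_names = ["lagged_return", "vader_compound", "noise_1", "noise_2"]
n_samples = 200
X = [[random.gauss(0.0, 1.0) for _ in feature_names]
     for _ in range(n_samples)]
y = [2.0 * row[0] + 1.0 * row[1] + random.gauss(0.0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, lam=0.1)
selected = [name for name, w in zip(feature_names, weights) if w != 0.0]

for name, w in zip(feature_names, weights):
    print(f"{name:>15}: {w:+.3f}")
print("selected:", selected)
```

In this setup the two informative columns should keep non-zero weights, shrunk slightly toward zero by the penalty, while the noise columns are dropped. Elastic Net uses the same objective with an additional L2 penalty on the weights, which changes only the denominator of the coordinate update.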

Version published to 10.21203/rs.3.rs-8194360/v1 on Research Square
Dec 3, 2025

Related articles

Applying Multiple Linear Regression to Enhance Short-Term Stock Forecasting Accuracy

This article has 2 authors:
1. TOUSIF AL RASHID
2. Raj Kumar
This article has no evaluations
Latest version Dec 15, 2025

Construction and analysis of data model for financial market volatility prediction based on support vector machine

This article has 1 author:
1. XiaoMeng Su
This article has no evaluations
Latest version Jan 21, 2026

Forecasting Crude Oil Prices: Insights from Machine Learning Approaches

This article has 1 author:
1. Haseen
This article has no evaluations
Latest version Dec 16, 2025
