A Unified Reinforcement Learning Framework for Dynamic User Profiling and Predictive Recommendation
Abstract
This paper proposes a unified reinforcement learning framework for dynamic user profiling and behavior prediction. Profile updating and next-step behavior prediction are formulated as a sequential decision process in which the state comprises the current profile snapshot and the interaction history, the action corresponds to profile updating and recommendation strategy selection, and the reward is driven by user feedback signals. The method models the evolution of user states as a Markov decision process and achieves adaptive iteration of user profiles through policy optimization and value function estimation. To keep the two tasks balanced, a joint objective function for profile updating and behavior prediction is integrated into the overall optimization, enhancing long-term stability and personalization. In the experiments, competing methods are systematically compared in terms of accuracy, ranking metrics, and cumulative reward, and the model's sensitivity to hyperparameter changes, environmental variation, and data perturbation is analyzed. The results show that the proposed method outperforms the baselines across multiple evaluation metrics, verifying the effectiveness of the reinforcement learning framework for dynamic profiling and accurate prediction in complex interactive environments. The study thus establishes a unified theoretical model and demonstrates its adaptability and robustness in dynamic settings, providing a systematic solution for user profiling and behavior prediction tasks.
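The decision loop described in the abstract, where the state includes a dynamic profile, the action couples a recommendation with a profile update, and user feedback supplies the reward, can be sketched as a minimal toy. This is an illustrative assumption, not the paper's implementation: the simulated user, the item embeddings, the REINFORCE-style policy update, and the exponential-moving-average profile update are all placeholders introduced for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

N_ITEMS, DIM = 5, 4
item_vecs = rng.normal(size=(N_ITEMS, DIM))   # hypothetical item embeddings
true_pref = rng.normal(size=DIM)              # hidden preference of a simulated user

def feedback(item_id):
    """Simulated feedback signal: click probability grows with preference alignment."""
    p = 1.0 / (1.0 + np.exp(-(item_vecs[item_id] @ true_pref)))
    return float(rng.random() < p)            # binary reward

def softmax(scores):
    z = scores - scores.max()
    e = np.exp(z)
    return e / e.sum()

theta = np.zeros((N_ITEMS, DIM))   # recommendation policy parameters
profile = np.zeros(DIM)            # dynamic user profile (part of the state)
alpha, beta = 0.1, 0.2             # policy learning rate, profile update rate
baseline = 0.0                     # running reward baseline for variance reduction

rewards = []
for step in range(2000):
    # State -> action: score items against the current profile, sample one.
    probs = softmax(theta @ profile)
    a = rng.choice(N_ITEMS, p=probs)
    r = feedback(a)
    rewards.append(r)

    # Policy optimization: REINFORCE update with a running baseline,
    # d log pi(a) / d theta_i = (1[i == a] - pi_i) * profile.
    grad_log_pi = np.outer(-probs, profile)
    grad_log_pi[a] += profile
    theta += alpha * (r - baseline) * grad_log_pi
    baseline += 0.05 * (r - baseline)

    # Profile-updating action: move the profile toward positively rated items.
    if r > 0:
        profile = (1.0 - beta) * profile + beta * item_vecs[a]
```

The point of the sketch is the unified loop: the same interaction step drives both the policy gradient (behavior prediction side) and the profile update (profiling side), rather than training the two in isolation.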