Reinforcement Learning for Real-World Non-Stationary Systems: An Observation-Aware Survey
Abstract
Reinforcement learning (RL) has achieved notable success in simulated environments and controlled benchmarks, yet its deployment in real-world, safety-critical systems remains limited. Practical settings such as robotics, healthcare, and industrial control are characterized by partial and noisy observability, costly information acquisition, limited and imbalanced data, and non-stationary dynamics, all of which violate standard assumptions underlying classical RL formulations. A growing body of work has shown that these challenges are tightly coupled: limitations in observation fundamentally shape learning efficiency, robustness, and safety. This article presents a survey and synthesis of recent reinforcement learning literature through the lens of observation-aware decision making. Rather than proposing new algorithms, we organize existing model-free, model-based, and offline RL methods according to how they control information acquisition, represent uncertainty, and adapt under distribution shift. We review theoretical and empirical results on partial observability, data imbalance, safety constraints, and non-stationarity, highlighting common structural assumptions and failure modes across approaches. By unifying these themes, the survey clarifies how observation design influences policy learning, evaluation, and deployment reliability. We conclude by identifying open methodological challenges and directions for future research in reinforcement learning for real-world, non-stationary environments.