Model-Based Reinforcement Learning Control for Non-Linear Dynamics
Abstract
Safe and sample-conscious controller synthesis for nonlinear dynamics benefits from reinforcement learning that exploits a model of the plant. A nonlinear mass–spring–damper with hardening effects and hard stops is considered. Two data-driven models are employed to enable off-plant training: a piecewise linear model assembled from operating-region linear descriptions and blended by triangular memberships, and a global nonlinear autoregressive model with exogenous input constructed from past inputs and outputs. Q-learning is performed with the model in the loop using an error-indexed discrete state space, a finite force alphabet, and a reward that balances absolute tracking error with its short-horizon decrease. When the trained agents are deployed on the true plant for reference tracking, the piecewise linear model tends to yield tighter regulation near the setpoint and reduced steady-state bias, while the nonlinear autoregressive route requires less prior structural knowledge and a simpler data-collection campaign, at the cost of larger residual error in the tested scenario. These findings indicate that model-based Q-learning with data-driven models enables off-plant policy learning while containing experimental risk. Observed performance reflects a trade-off between fidelity obtained from localized linearization and generality afforded by global nonlinear regression, as well as design choices in state discretization and reward shaping. Prospective improvements include adaptive membership shaping, richer regressors, and limited on-plant refinement to reduce model–plant mismatch.
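The following sketches are illustrative only; the abstract does not give the paper's actual membership functions, local model orders, discretization, or reward gains, so every constant, function name, and model form below (CENTERS, A_LOC, B_LOC, blended_step, and so on) is a hypothetical placeholder. The first block shows one plausible way to blend operating-region linear descriptions through triangular memberships into a single surrogate model, assuming first-order local models over a displacement-indexed operating range.

```python
import numpy as np

# --- hypothetical operating points and local linear models (illustrative only) ---
CENTERS = np.array([-0.5, 0.0, 0.5])   # membership centers over the operating range
A_LOC = np.array([0.90, 0.95, 0.88])   # assumed local discrete-time pole per region
B_LOC = np.array([0.02, 0.03, 0.02])   # assumed local input gain per region

def triangular_weights(x, centers):
    """Triangular memberships over uniformly spaced centers, normalized to sum to one."""
    w = np.maximum(0.0, 1.0 - np.abs(x - centers) / np.diff(centers).mean())
    return w / w.sum() if w.sum() > 0 else np.ones_like(centers) / len(centers)

def blended_step(y, u):
    """Blend the local first-order predictions into one surrogate-model step."""
    w = triangular_weights(y, CENTERS)
    return float(np.sum(w * (A_LOC * y + B_LOC * u)))
```

The second block is a minimal sketch of the model-in-the-loop Q-learning described above, assuming an error-indexed discrete state space built by binning the tracking error, a small finite force alphabet, and a reward that trades off absolute error against its one-step decrease, reward = -|e| + LAMBDA * (|e_prev| - |e|). The interface `model_step(y, u)` stands in for either surrogate (the blended piecewise linear model or a NARX predictor); all gains and bin edges are assumptions.

```python
ERROR_BINS = np.linspace(-1.0, 1.0, 21)           # assumed error bin edges
FORCES = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])  # assumed finite force alphabet
ALPHA, GAMMA, EPS, LAMBDA = 0.1, 0.95, 0.1, 0.5   # assumed learning/reward gains

Q = np.zeros((len(ERROR_BINS) + 1, len(FORCES)))  # tabular action-value estimates

def state_index(error):
    """Map a continuous tracking error onto a discrete state index."""
    return int(np.digitize(error, ERROR_BINS))

def reward(err, prev_err):
    """Penalize absolute tracking error, reward its short-horizon decrease."""
    return -abs(err) + LAMBDA * (abs(prev_err) - abs(err))

def train_episode(model_step, reference, y0, steps, rng):
    """One Q-learning episode run against a data-driven surrogate, not the plant."""
    y, prev_err = y0, reference - y0
    s = state_index(prev_err)
    for _ in range(steps):
        # epsilon-greedy choice over the force alphabet
        a = rng.integers(len(FORCES)) if rng.random() < EPS else int(np.argmax(Q[s]))
        y = model_step(y, FORCES[a])              # step the surrogate model off-plant
        err = reference - y
        s_next = state_index(err)
        # standard tabular Q-learning update
        Q[s, a] += ALPHA * (reward(err, prev_err) + GAMMA * np.max(Q[s_next]) - Q[s, a])
        s, prev_err = s_next, err
```

Under these assumptions, a call such as `train_episode(blended_step, reference=0.5, y0=0.0, steps=200, rng=np.random.default_rng(0))` would run one off-plant training episode; the resulting Q table would then be evaluated greedily on the true plant for reference tracking.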