Fictive Learning in Model-based Reinforcement Learning by Generalized Reward Prediction Errors
Abstract
Reinforcement learning (RL) is a normative computational framework for reward-based learning. However, widely used RL models, including Q-learning and its variants, fail to capture key behavioral phenomena observed in animal experiments, particularly the dynamic switching between model-free and model-based control in the two-step task. A fundamental discrepancy is that these models restrict learning to experienced outcomes, whereas biological agents may generalize learning to unvisited options based on internal world models, a process known as fictive learning. We propose a simple, brain-inspired fictive learning rule and apply it to the rodent two-step task to examine whether fictive learning can explain the observed behavior. The rule uses a generalized reward prediction error to update both experienced and unvisited states and actions: for fictive updates, the factual prediction error is scaled by event correlations inferred from the internal model. Such a generalized reward prediction error could be supported by brain-wide dopaminergic broadcasting. Through simulations, we show that the model reproduces key behavioral traits in the two-step task, including stay probabilities and regression-analysis results, which common RL models fail to explain. Model fitting confirms its advantage over existing alternatives. The model also replicates the dopaminergic dynamics observed in the same task. This framework bridges normative RL theory and biological learning, offering new insights into adaptive behavior.
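One way to read the proposed rule is as a tabular temporal-difference update whose factual error is broadcast to every state-action pair in proportion to a model-inferred correlation with the experienced pair. The sketch below is a minimal illustration under that reading: the Q-table, the hypothetical correlation tensor `C`, the max-based target, and all parameter names are assumptions for exposition, not the authors' implementation.

```python
import numpy as np

def fictive_q_update(Q, C, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Sketch of a generalized reward-prediction-error update (illustrative only).

    Q : (n_states, n_actions) value table.
    C : (n_states, n_actions, n_states, n_actions) correlations between
        state-action pairs, assumed to be inferred from an internal world
        model; C[s, a, s, a] is taken to be 1 so the experienced pair
        receives the ordinary factual update.
    """
    # Factual reward prediction error for the experienced transition
    delta = r + gamma * np.max(Q[s_next]) - Q[s, a]
    # Generalized update: experienced and unvisited pairs share the same
    # error, each weighted by its model-inferred correlation with (s, a)
    Q += alpha * C[s, a] * delta
    return Q
```

Under this reading, setting `C[s, a]` to an indicator on the experienced pair recovers standard Q-learning, while nonzero off-diagonal correlations implement the fictive updates to unvisited options.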