Fictive Learning in Model-based Reinforcement Learning by Generalized Reward Prediction Errors
Abstract
Reinforcement learning (RL) is a normative computational framework for reward-based learning. However, widely used RL models, including Q-learning and its variants, fail to capture key behavioral phenomena observed in animal experiments, particularly the dynamic switching between model-free and model-based control in the two-step task. A fundamental discrepancy is that these models restrict learning to experienced outcomes, whereas biological agents may generalize learning to unvisited options based on internal world models, a process known as fictive learning. We propose a simple, brain-inspired fictive learning rule and apply it to the rodent two-step task to examine whether fictive learning can explain the observed behavior. The rule uses a generalized reward prediction error to update both experienced and unvisited states and actions: for fictive updates, the factual prediction error is scaled by event correlations inferred from the internal model. Such a generalized reward prediction error could be supported by brain-wide dopaminergic broadcasting. Through simulations, we show that the model reproduces key behavioral traits in the two-step task, including stay probabilities and regression-analysis results, which common RL models fail to explain. Model fitting confirms its advantage over existing alternatives. The model also replicates the dopaminergic dynamics observed in the same task. This framework bridges normative RL theory and biological learning, offering new insights into adaptive behavior.
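One way to read the proposed rule is as a tabular temporal-difference update whose factual error is broadcast to every state-action pair in proportion to a model-inferred correlation with the experienced pair. The sketch below is a minimal illustration under that reading: the Q-table, the hypothetical correlation tensor `C`, the max-based target, and all parameter names are assumptions for exposition, not the authors' implementation.

```python
import numpy as np

def fictive_q_update(Q, C, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Sketch of a generalized reward-prediction-error update (illustrative only).

    Q : (n_states, n_actions) value table.
    C : (n_states, n_actions, n_states, n_actions) correlations between
        state-action pairs, assumed to be inferred from an internal world
        model; C[s, a, s, a] is taken to be 1 so the experienced pair
        receives the ordinary factual update.
    """
    # Factual reward prediction error for the experienced transition
    delta = r + gamma * np.max(Q[s_next]) - Q[s, a]
    # Generalized update: experienced and unvisited pairs share the same
    # error, each weighted by its model-inferred correlation with (s, a)
    Q += alpha * C[s, a] * delta
    return Q
```

Under this reading, setting `C[s, a]` to an indicator on the experienced pair recovers standard Q-learning, while nonzero off-diagonal correlations implement the fictive updates to unvisited options.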