When models matter: Environmental demand guides the arbitration between model-based and model-free control

Abstract

As humans, we often repeat previously rewarded actions without thinking, but we can also plan ahead by simulating actions with an internal model of the environment. These two types of control are commonly conceptualised as model-free and model-based control, respectively. While a body of research addresses interindividual differences in the use of either strategy, we aimed to test whether people can also learn to regulate which strategy to use based on environmental demand. We used a two-stage decision-making task with two different first-stage states that either repeated or alternated between trials. Both states shared a deterministic transition structure in which actions led to the same second-stage states, which offered drifting reward options. We manipulated how often participants (n = 140) were exposed to alternations versus repetitions of the first-stage states. When these states frequently repeat, there is less need to consult the transition structure, because it pays off to adopt model-free control and simply retake previously rewarded actions. Conversely, when first-stage states frequently alternate, it is more beneficial to adopt model-based control, which considers the transition structure and generalises reward outcomes across the first-stage states. In line with our hypothesis, participants exposed to more first-stage state alternations were more model-based in a later test phase than participants exposed to more first-stage state repetitions. These findings suggest that people learn to arbitrate between reinforcement-learning strategies in a manner consistent with a cost-benefit analysis that is sensitive to environmental demands.
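
The contrast between the two strategies can be illustrated with a minimal sketch. It is not the authors' task code or computational model: the class names, the learning rate `alpha`, the rule that action i leads to second-stage state i, and the simplification of one reward value per second-stage state are all assumptions made for illustration. The model-free learner stores a value for each pair of first-stage state and action, whereas the model-based learner stores values for second-stage states and looks them up through the shared transition structure, so a reward obtained from one first-stage state carries over to the other.

```python
import random

N_FIRST_STAGE_STATES = 2  # two first-stage states sharing one transition structure
N_ACTIONS = 2             # assumed: one action per second-stage state


def transition(first_state, action):
    """Deterministic transition shared by both first-stage states.

    Assumption for illustration: action i always leads to second-stage state i,
    regardless of which first-stage state the trial started in.
    """
    return action


class ModelFreeLearner:
    """Caches a value per (first-stage state, action) pair; ignores transitions."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # hypothetical learning rate
        self.q = [[0.0] * N_ACTIONS for _ in range(N_FIRST_STAGE_STATES)]

    def value(self, first_state, action):
        return self.q[first_state][action]

    def update(self, first_state, action, reward):
        error = reward - self.q[first_state][action]
        self.q[first_state][action] += self.alpha * error


class ModelBasedLearner:
    """Learns second-stage values and evaluates actions via the transition model."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # hypothetical learning rate
        self.v = [0.0] * N_ACTIONS  # one value per second-stage state

    def value(self, first_state, action):
        return self.v[transition(first_state, action)]

    def update(self, first_state, action, reward):
        second_state = transition(first_state, action)
        self.v[second_state] += self.alpha * (reward - self.v[second_state])


def first_stage_sequence(n_trials, p_alternate, rng):
    """First-stage states that switch with probability p_alternate on each trial."""
    state = rng.randrange(N_FIRST_STAGE_STATES)
    states = [state]
    for _ in range(n_trials - 1):
        if rng.random() < p_alternate:
            state = 1 - state  # alternate to the other first-stage state
        states.append(state)
    return states


# One rewarded trial in first-stage state 0, then a query from first-stage state 1.
mf, mb = ModelFreeLearner(), ModelBasedLearner()
mf.update(0, 1, reward=1.0)
mb.update(0, 1, reward=1.0)
print(mf.value(1, 1))  # 0.0: nothing learned about the other first-stage state
print(mb.value(1, 1))  # 0.5: the outcome generalises through the shared transitions
```

In this sketch, a high `p_alternate` in `first_stage_sequence` corresponds to the frequent-alternation condition, where only the model-based learner can use the previous trial's outcome, while a low `p_alternate` corresponds to the frequent-repetition condition, where the cached model-free values are usually sufficient.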
