Evolving learning state reactivation and value encoding neural dynamics in multi-step planning
Abstract
Planning in value-based decision making is often dynamic, with reinforcement learning (RL) providing a powerful framework for investigating how value and action at each step change across trials. Surprisingly, the evolving neural signatures of value estimation and state reactivation in multi-step planning, both within and across trials, have received little consideration. Here, using magnetoencephalography (MEG), we detail neural dynamics associated with planning in a task where subjects had to find an optimal path to maximise reward. Behavioural evidence showed improved performance across trials, including an increasing disregard for low-value states. MEG data captured evolving value estimation signals such that, across trials, stronger and earlier within-trial value encoding emerged, linked to boosted vmPFC activity. Value encoding signals correlated positively with individual performance metrics, as reflected in overall task-related reward earnings. Strikingly, across trials, there was an attenuation of state reactivation for negative-value states, an effect that positively correlated with the evolving avoidance of negative-value states. These findings, which link neural dynamics, including a valence-dependent selective reactivation of negative states, to across-trial behavioural improvement, advance an understanding of learning during multi-step planning.
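To make the RL framing above concrete, the following is a minimal sketch, assuming a tabular Q-learning agent on a toy three-step path-finding tree; the task structure, reward values, and parameters are illustrative and not taken from the paper. It shows one way step-wise value estimates can sharpen across trials so that a negative-value branch is increasingly avoided, mirroring the behavioural pattern described in the abstract.

```python
# Minimal sketch (assumption, not the paper's model): tabular Q-learning on a
# small path-finding tree. Task structure, rewards, and parameters are illustrative.
import random

# Three-step decision tree: each non-terminal state offers two successors;
# terminal rewards include a negative-value branch the learner should avoid.
TRANSITIONS = {"start": ["A", "B"], "A": ["A1", "A2"], "B": ["B1", "B2"]}
REWARDS = {"A1": 1.0, "A2": -1.0, "B1": 0.2, "B2": 0.0}

ALPHA, GAMMA, EPSILON = 0.3, 1.0, 0.1  # learning rate, discount, exploration
Q = {(s, s2): 0.0 for s, opts in TRANSITIONS.items() for s2 in opts}

def run_trial():
    """Walk one path from 'start' to a terminal state, updating Q online."""
    state, earned = "start", 0.0
    while state in TRANSITIONS:
        options = TRANSITIONS[state]
        # epsilon-greedy choice over successor states
        if random.random() < EPSILON:
            nxt = random.choice(options)
        else:
            nxt = max(options, key=lambda s2: Q[(state, s2)])
        r = REWARDS.get(nxt, 0.0)
        # bootstrapped target: immediate reward plus best value reachable from nxt
        future = max((Q[(nxt, s2)] for s2 in TRANSITIONS.get(nxt, [])), default=0.0)
        Q[(state, nxt)] += ALPHA * (r + GAMMA * future - Q[(state, nxt)])
        earned += r
        state = nxt
    return earned

earnings = [run_trial() for _ in range(200)]
# Average reward rises from early to late trials as the negative branch is devalued.
print(sum(earnings[:20]) / 20, sum(earnings[-20:]) / 20)
print({k: round(v, 2) for k, v in Q.items()})
```

Under these assumptions, the learned value of the transition into the negative terminal state converges toward its reward of -1, so the greedy policy routes around it, a simple analogue of the low-value state disregard reported behaviourally.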