Working Memory and Reinforcement Learning Interactions when Simultaneously Pursuing Reward and Avoiding Punishment: No Relationship to Internalizing Symptoms
Abstract
Humans learn adaptive behaviors via a durable but incremental reinforcement-learning (RL) system and a fast but fleeting working memory (WM) system. Past work parsing these systems focused on reward learning alone. Hence, little is known about how they interact while simultaneously learning to avoid punishment, and whether arbitrating between these demands is disrupted by psychiatric symptoms. We administered a novel reward/punishment RL-WM task to an online sample oversampled for depression and anxiety symptoms (N=298; n=275 after quality control). We found participants avoided punishment during initial learning, yet poorly retained this avoidance. Computational modeling captured this pattern by allowing the fleeting WM system to facilitate punishment avoidance, while the durable RL system retained little about punishment. We also found individual differences in a previously documented tendency for WM to blunt the RL system. Specifically, while we indeed found that a subset of high-blunting participants paradoxically showed the worst (RL-based) ultimate retention when WM had facilitated initial learning, another subset enjoyed the greatest ultimate retention in this scenario. Importantly, we found that the retention patterns in the task could not be captured by a computational model in which the RL system was replaced by a stimulus-response one. Finally, we found that task performance (analyzed behaviorally and via computational modeling) was largely spared as a function of depression/anxiety and trait rumination. Overall, our findings provide insight into how WM and RL interact when facing the ubiquitous challenge of attaining reward while simultaneously avoiding punishment, and demonstrate intact behavior under internalizing-disorder symptoms.
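The division of labor described above can be illustrated with a minimal sketch of a generic RL-WM mixture learner. This is not the authors' fitted model: the class name `RLWMAgent`, every parameter name and default value, and the outcome coding (+1 for reward, -1 for punishment) are assumptions made for illustration. The sketch only shows the qualitative idea that a one-shot but decaying WM store can support punishment avoidance early on, while an incremental RL store with a small punishment learning rate retains little of it.

```python
import math


def softmax(values, beta):
    """Softmax over action values with inverse temperature beta."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


class RLWMAgent:
    """Hypothetical RL + WM mixture learner (illustrative assumptions only)."""

    def __init__(self, n_stimuli, n_actions, alpha_reward=0.1,
                 alpha_punish=0.02, wm_decay=0.2, wm_weight=0.8, beta=8.0):
        self.n_actions = n_actions
        self.alpha_reward = alpha_reward  # assumed RL learning rate for outcomes >= 0
        self.alpha_punish = alpha_punish  # assumed (smaller) RL learning rate for punishment
        self.wm_decay = wm_decay          # assumed per-trial WM forgetting rate
        self.wm_weight = wm_weight        # assumed reliance on WM in the choice policy
        self.beta = beta
        self.q = [[0.0] * n_actions for _ in range(n_stimuli)]   # durable RL values
        self.wm = [[0.0] * n_actions for _ in range(n_stimuli)]  # fleeting WM values

    def policy(self, stimulus):
        """Choice probabilities as a weighted mixture of WM and RL policies."""
        p_rl = softmax(self.q[stimulus], self.beta)
        p_wm = softmax(self.wm[stimulus], self.beta)
        w = self.wm_weight
        return [w * pw + (1 - w) * pr for pw, pr in zip(p_wm, p_rl)]

    def update(self, stimulus, action, outcome):
        """Update both stores after observing an outcome."""
        # WM contents for every stimulus decay toward zero on each trial.
        for row in self.wm:
            for a in range(self.n_actions):
                row[a] *= (1 - self.wm_decay)
        # WM stores the latest outcome in one shot.
        self.wm[stimulus][action] = outcome
        # RL learns incrementally, and more slowly from punishment.
        alpha = self.alpha_reward if outcome >= 0 else self.alpha_punish
        self.q[stimulus][action] += alpha * (outcome - self.q[stimulus][action])
```

Under these assumptions, a single punished choice immediately lowers the probability of repeating that action through the WM term, but the WM trace fades over intervening trials and the RL value has barely moved, which mirrors the pattern of good initial avoidance and poor retention.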