Early Arousal Signals Drive Reward Learning and Subsequent Choice Behaviour

Abstract

Learning to reinforce rewarding actions and avoid repeated mistakes is crucial for survival in dynamic environments. Yet it remains unclear how distinct neural signals coordinate to implement reward-based decision-making and behavioural adjustment. We recorded simultaneous electroencephalography (EEG) and pupillometry during a probabilistic reversal learning task. Leveraging single-trial EEG, we first replicate the presence of two feedback-locked neural representations: an early signal previously linked to alertness and to switching behaviour following negative feedback, and a late signal associated with value updating and reward learning. Using single-trial pupillometry, we then show that differences in feedback-evoked pupil responses between positive and negative feedback are driven primarily by the encoding of negative feedback. Jointly examining these EEG and pupillometry signatures, we show that, following negative feedback, increased trial-by-trial coupling between the pupil response and the early, but not the late, EEG signal is linked to increased uncertainty and exploration tendency, as well as to reduced accuracy and evidence accumulation on the next trial. Consistent with previous research implicating the locus coeruleus-noradrenaline system in uncertainty signalling and network resets, we propose that when internal estimates of contextual uncertainty are high following negative feedback, the early signal, likely regulated by locus coeruleus activity, implements a network reset in the reward learning structures underlying the later learning signal. This interruption may simultaneously increase the neural gain for processing novel information and decrease the influence of existing representations in reward learning structures, in turn improving performance by creating new, more accurate internal representations of the external world.

Significance Statement

The current study jointly examines EEG and pupillometry signatures of reward-based reversal learning. It suggests that when internal estimates of contextual uncertainty are high following negative feedback, an early neural signal, likely regulated by locus coeruleus activity, implements a network reset in the reward learning structures underlying a later learning signal. This interruption may simultaneously increase the neural gain for processing novel information and decrease the influence of existing representations in reward learning structures, in turn improving performance by creating new, more accurate internal representations of the external world.
