VTA dopamine neuron activity produces spatially organized value representations
Abstract
How does the activity of midbrain dopamine (DA) neurons reinforce actions? A prominent hypothesis is that the activity of ventral tegmental area (VTA) DA neurons instructs representations of predicted reward, or value, in downstream neurons1. To directly test this model, we performed comprehensive striatal recordings in mice engaged in a trial-and-error probabilistic learning task in which they continuously adapted their choices to obtain a reward consisting of optogenetic stimulation of VTA DA neurons (paired with an auditory cue). We then assessed neural representations of action values (estimated from a behavioral model), revealing for the first time that VTA DA stimulation is sufficient to generate downstream neural correlates of action value. Surprisingly, these value correlates were strongest in the intermediate caudoputamen (CP) and weakest in the nucleus accumbens (NAc), despite the NAc being the major projection target of VTA DA neurons2,3. This was true not only for the value of each choice, but also for state value (reward expectation) and relative value (the decision variable). However, value representations were differentially organized within the intermediate CP, with ventromedial domains (which receive inputs from orbitofrontal cortex) preferentially encoding state value and dorsolateral domains (which receive inputs from motor cortex) preferentially encoding relative value. A difference in learning rate for the value computation between the NAc and CP did not explain the relatively weak value correlates in the NAc. Instead, we found that VTA DA stimulation was sufficient to produce learned neural responses to the stimulation-paired auditory cue throughout the striatum, including in the NAc, and that animals worked for this cue rather than for VTA DA stimulation itself.
Overall, this suggests that VTA DA neurons support trial-and-error learning indirectly, by making stimuli valuable ('conditioned reinforcers'), which in turn support the generation of action value representations in the CP.
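The abstract refers to a behavioral model that estimates action values, together with a learning rate, a relative value serving as the decision variable, and a state value (reward expectation). The preprint does not specify the model's form here, but a standard delta-rule (Q-learning-style) formulation is the conventional choice for this kind of two-choice probabilistic task; the sketch below is an illustration under that assumption, not the authors' actual model, and all function names and parameter values are hypothetical.

```python
import math

def update_action_value(q, action, reward, alpha=0.1):
    """Delta-rule update: nudge the chosen action's value toward the
    received outcome by a fraction alpha (the learning rate)."""
    q = dict(q)  # copy so each trial's values are preserved if needed
    q[action] += alpha * (reward - q[action])
    return q

def prob_choose_left(q, beta=3.0):
    """Softmax choice rule driven by the relative value
    (q_left - q_right), i.e. the decision variable."""
    relative_value = q["left"] - q["right"]
    return 1.0 / (1.0 + math.exp(-beta * relative_value))

# Simulated block in which only 'left' is rewarded: the left action
# value climbs toward 1, and choice probability follows.
q = {"left": 0.0, "right": 0.0}
for _ in range(20):
    q = update_action_value(q, "left", reward=1.0)

# One common proxy for state value (reward expectation) is the
# value of the better available action.
state_value = max(q.values())
```

After 20 rewarded left choices with alpha = 0.1, the left action value reaches 1 - 0.9**20 (about 0.88), so the softmax choice probability is strongly biased toward left. Differences in alpha between regions correspond to the "difference in learning rate for the value computation" the abstract rules out as an explanation.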