Sniffing Shapes Dopamine Signals for Reward Prediction
Discuss this preprint
Start a discussion What are Sciety discussions?Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
Adaptive behaviors depend on predicting outcomes from sensory evidence. Dopamine neurons in the ventral tegmental area (VTA) broadcast reward-prediction signals that guide learning. Yet the principles functionally coordinating information flow from input regions to VTA are incompletely understood. In the olfactory system, the sniff cycle structures sampling and odor encoding. We therefore asked whether this rhythm also entrains the ventral striatum to VTA communication and if so, how this shapes the implementation of predictive coding in dopamine neurons. We recorded identified dopamine neurons throughout olfactory conditioning and found that their firing shifted systematically to the post-inspiratory phase of the sniff cycle with learning. This temporal realignment predicted a neuron’s engagement in value encoding along the optimism-pessimism-spectrum of distributional reinforcement learning. This is associated with an enhanced phase-gated communication channel from the striatal olfactory tubercle to dopamine neurons, the strength of which predicts task performance. Thus, the sniffing rhythm provides a scaffold for information flow, revealing a phase-gating mechanism for the integration of outcome predicting sensory evidence to dopamine neurons during reinforcement learning.
Significance Statement
Predicting the future from sensory cues is central to adaptive behaviors. We show that sniffing temporally organizes the flow of information between the olfactory tubercle of ventral striatum and midbrain dopamine neurons during odor-guided learning in mice. Phase-specific coupling in the respiratory cycle determines when sensory information reaches dopamine neurons and how predictive signals are encoded. These findings link active-sensing rhythms to predictive reinforcement signals. This temporal scaffold yields a gradient of “optimistic” to “pessimistic” predictions consistent with distributional reinforcement-learning theories. These insights contribute to a better understanding of the large-scale computations across multiple brain regions to predict future outcomes.