Shared mechanisms of auditory and non-auditory vocal learning in the songbird brain

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    McGregor et al. establish a new reinforcement learning paradigm for songbirds in which mild cutaneous electrical stimulation, rather than auditory feedback (white noise), serves as the reinforcer. Their data show that this somatosensory stimulus can aversively drive pitch changes of a targeted syllable in much the same way as an auditory stimulus. They further show that the anterior forebrain pathway (AFP) and dopaminergic projections to the AFP are necessary for this non-auditory vocal learning, by electrolytically lesioning the output nucleus of the AFP and by depleting dopaminergic input to Area X. Their analysis is rigorous and their data convincingly show shared mechanisms for vocal reinforcement learning using white noise (auditory) or cutaneous electrical stimulation (non-auditory).

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #3 agreed to share their name with the authors.)

Abstract

Songbirds and humans share the ability to adaptively modify their vocalizations based on sensory feedback. Prior studies have focused primarily on the role that auditory feedback plays in shaping vocal output throughout life. In contrast, it is unclear how non-auditory information drives vocal plasticity. Here, we first used a reinforcement learning paradigm to establish that somatosensory feedback (cutaneous electrical stimulation) can drive vocal learning in adult songbirds. We then assessed the role of a songbird basal ganglia thalamocortical pathway critical to auditory vocal learning in this novel form of vocal plasticity. We found that both this circuit and its dopaminergic inputs are necessary for non-auditory vocal learning, demonstrating that this pathway is critical for guiding adaptive vocal changes based on both auditory and somatosensory signals. The ability of this circuit to use both auditory and somatosensory information to guide vocal learning may reflect a general principle for the neural systems that support vocal plasticity across species.

Article activity feed

  2. Reviewer #1 (Public Review):

    Vocal learning has long been assumed to rely mainly on vocal feedback. In finches, birds aim to imitate a memory of a tutor song heard early in life. Here, the authors show that Bengalese finches can also learn from song-targeted cutaneous feedback. When cutaneous stimulation is made contingent on the way a bird sings a note, the bird changes that note to avoid the stimulation. This learning required the dopamine-basal ganglia circuitry previously implicated in natural vocal learning. Thus, vocal circuits can leverage distinct types of sensory feedback for learning.

    By design, the learning paradigm lacks ecological validity, which makes it all the more surprising that the birds could learn.

  3. Reviewer #2 (Public Review):

    McGregor et al. establish a new reinforcement learning paradigm for songbirds in which mild cutaneous electrical stimulation, rather than auditory feedback (white noise), serves as the reinforcer. Their data show that this somatosensory stimulus can aversively drive pitch changes of a targeted syllable in much the same way as an auditory stimulus. They further show that the anterior forebrain pathway (AFP) and dopaminergic projections to the AFP are necessary for this non-auditory vocal learning, by electrolytically lesioning the output nucleus of the AFP and by depleting dopaminergic input to Area X. Their analysis is rigorous and their data convincingly show shared mechanisms for vocal reinforcement learning using white noise (auditory) or cutaneous electrical stimulation (non-auditory).

    However, both the ability of birds to learn from non-auditory stimuli and the involvement of the AFP in this process have been shown previously in Zai et al. 2020. That study shows that visual stimuli (short periods of light off) can successfully drive changes in pitch in both hearing and deaf birds; in deaf birds, the involvement of the AFP was furthermore demonstrated using a similar lesioning approach. Thus, two of the three main claims of novelty in the manuscript are not in fact novel. The main novelty beyond the 2020 study is that McGregor et al. are the first to show that somatosensory information (cutaneous electrical stimulation) can induce vocal plasticity and that dopaminergic projections to the AFP are somehow involved in this process.

    There is an issue with the interpretation of the dopamine experiments. The authors claim that dopaminergic input is necessary for observing adaptive changes, but their data suggest otherwise, namely that dopamine sets the direction of the change. Strictly speaking, the statement 'dopaminergic inputs are required for non-auditory vocal learning' is incorrect, since the data show a reversal in learning direction, which is also a form of learning. The apparent reversal in learning direction under dopamine depletion should therefore be discussed.

    The authors also claim that there is no systematic difference between the learning magnitudes obtained with cutaneous stimulation and with auditory white noise stimulation, suggesting that both training methods result in the same learning efficacy. While their data indeed show no significant difference between these training methods, there is little ground for this claim. First, learning magnitudes seem to vary a lot across individuals: they may be similar on average, but there does not seem to be a correlation between the two. Second, similar learning magnitudes only show that the saliency of the two stimuli was adjusted to be roughly equal, which is not surprising given that the authors adjusted the magnitude of the electric current using a criterion similar to that of their initial 2007 paper: in Tumer and Brainard (2007) they adjusted white noise amplitude until they observed stoppages during the first day of exposure, and in this manuscript they adjusted the electric current to interrupt song on the first few instances of cutaneous stimulation.
    They further state in their methods that the magnitude of the electric current was "typically" set in the range 100-350 μA, which is a wide range and which most likely influences the saliency of the reinforcing stimulus. Most importantly, the influence of the electric current magnitude is neither discussed nor analyzed in the manuscript. Presumably, a strong electric current is very effective and a weak auditory stimulus is very ineffective.

    Statistics: Their statistical tests are in general solid and support the claims of the paper. Using a hierarchical bootstrap approach, the authors show that cutaneous stimulation can drive adult songbird vocal learning at the population level. However, there are a few instances where more data would help to better evaluate the significance of the results. For example, only one of the three days of baseline song is shown, and for only one example bird. Worst of all, the data appear to be duplicated for this bird on two days, which points to a serious flaw in either the analysis or the illustration. The authors should show more baseline days and include more birds.
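
    To make the population-level test concrete, here is a minimal sketch of a hierarchical (two-level) bootstrap, assuming per-rendition adaptive pitch changes grouped by bird. It is not the authors' implementation: the function names, the data layout, the example values, and the choice to average bird means rather than pool renditions are all illustrative assumptions.

```python
import random
from statistics import mean


def hierarchical_bootstrap(pitch_changes_by_bird, n_boot=2000, seed=0):
    """Return bootstrap means, resampling birds first and renditions second.

    pitch_changes_by_bird: dict mapping a (hypothetical) bird ID to a list of
    per-rendition adaptive pitch changes relative to baseline.
    """
    rng = random.Random(seed)
    birds = list(pitch_changes_by_bird)
    boot_means = []
    for _ in range(n_boot):
        # Level 1: resample birds with replacement.
        sampled_birds = rng.choices(birds, k=len(birds))
        bird_means = []
        for bird in sampled_birds:
            values = pitch_changes_by_bird[bird]
            # Level 2: resample renditions within each sampled bird.
            bird_means.append(mean(rng.choices(values, k=len(values))))
        boot_means.append(mean(bird_means))
    return boot_means


def prob_greater_than_zero(boot_means):
    """Fraction of bootstrap means above zero (adaptive direction)."""
    return sum(m > 0 for m in boot_means) / len(boot_means)


# Made-up example values, for illustration only.
example = {
    "bird_a": [0.4, 0.6, 0.5, 0.7],
    "bird_b": [0.1, 0.3, 0.2],
    "bird_c": [0.8, 0.9, 1.1, 0.7, 1.0],
}
p_boot = prob_greater_than_zero(hierarchical_bootstrap(example, n_boot=500))
```

    Resampling birds before renditions keeps the between-bird variability in the bootstrap distribution, which is why adding more birds (and not only more renditions per bird) matters for the strength of the population-level claim.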

    The 2-sided KS test to assess the difference between baseline and end of cutaneous stimulation is extremely significant (10^-12) for that one example bird, which is nice, but it would be useful to see whether this is the case for all birds and not just that example bird on that example day. Also, it would be interesting to see how these statistics behave when comparing two or more baseline days. It is unlikely that the washout the KS analysis reveals in this one bird will apply in all birds.
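
    The baseline-versus-baseline control suggested above could be sketched as follows. This is a standard-library illustration under stated assumptions, not the authors' code: the p-value uses the asymptotic Kolmogorov series, which is only approximate for small samples, and the function names and the per-day data layout are hypothetical.

```python
import math
from itertools import combinations


def ks_two_sample(a, b):
    """Two-sample KS statistic D and an asymptotic two-sided p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both empirical CDFs past the current value x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.3:
        # The series converges slowly here; the true value is within 1e-5 of 1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def compare_days(pitch_by_day):
    """Run the KS test on every pair of days (e.g. several baseline days)."""
    return {(d1, d2): ks_two_sample(pitch_by_day[d1], pitch_by_day[d2])
            for d1, d2 in combinations(sorted(pitch_by_day), 2)}
```

    Applying `compare_days` to the baseline days of every bird would show how large the KS statistic is from day-to-day drift alone, which is the reference against which the baseline-versus-training comparison should be read. In practice a library routine such as `scipy.stats.ks_2samp` would normally be used instead of a hand-written version.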

    Then, for the analysis, only data recorded between 10 am and 12 pm are used (to account for potential circadian effects), but this window is extended if birds sing fewer than 30 renditions of the target syllable during it. It is unclear from the description how often this is the case and how it influences the analysis. Furthermore, the authors exclude birds whose singing rate dropped below 10 songs per day for more than a day, again without stating how many birds were excluded based on this criterion.
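
    The window rule could be made explicit along the following lines. Since the manuscript does not describe how the window is extended, the symmetric 30-minute widening used here is purely an assumption, as are the function names and the representation of rendition times as decimal hours of the day.

```python
def select_analysis_window(rendition_hours, start=10.0, end=12.0,
                           min_count=30, step=0.5):
    """Pick renditions in [start, end), widening the window if too few.

    rendition_hours: times of target-syllable renditions, in decimal hours.
    Returns (selected_hours, (window_start, window_end), was_extended).
    The symmetric widening by `step` hours is an assumed rule.
    """
    lo, hi = start, end
    selected = [h for h in rendition_hours if lo <= h < hi]
    extended = False
    while len(selected) < min_count and (lo > 0.0 or hi < 24.0):
        lo = max(0.0, lo - step)
        hi = min(24.0, hi + step)
        selected = [h for h in rendition_hours if lo <= h < hi]
        extended = True
    return selected, (lo, hi), extended


def fraction_of_days_extended(days):
    """Share of recording days on which the default window was extended."""
    flags = [select_analysis_window(hours)[2] for hours in days]
    return sum(flags) / len(flags)
```

    Reporting a quantity like `fraction_of_days_extended` per bird, together with the final window bounds, would answer how often the extension applies and whether it could bias the circadian control.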

  4. Reviewer #3 (Public Review):

    Song learning in songbirds largely depends on the auditory feedback provided by the perception of the bird's own song. Changes in the auditory feedback can drive song plasticity. Indeed, the online processing of auditory feedback allows birds to rapidly adjust their singing behavior when facing artificial or natural disturbances in the acoustic domain. Here, McGregor et al. investigated whether non-auditory feedback can drive vocal learning in adult male Bengalese finches. They modified a classical reinforcement learning protocol used to study adult birdsong plasticity. In this paradigm, auditory feedback is made contingent on a song syllable feature. Here, the authors instead used somatosensory feedback, consisting of mild electrical stimulation of the skin made contingent on the pitch of a song syllable. The results show that this somatosensory feedback drives vocal plasticity in adult birds as efficiently as contingent auditory feedback (white noise). Using brain lesions and pharmacological approaches, they demonstrate that the basal ganglia-cortical network involved in auditory reinforcement vocal learning is also required for non-auditory reinforcement vocal learning.

    Overall, the experiments are well designed. I particularly appreciated the fact that, in most of the experiments, the subjects were their own controls, which is a clear strength for such studies (it controls for interindividual variability). The data provided support the main conclusions of the paper, but some additional analyses would strengthen the message. Finally, the paper is overall easy to follow, even for naïve non-birdsong readers.

    My main concern, however, is that the authors unfortunately neither refer to nor discuss the study of Zai, A.T., Cavé-Lopez, S., Rolland, M., Giret, N., Hahnloser, R.H.R., 2020. Sensory substitution reveals a manipulation bias. Nature Communications 11, 5940. https://doi.org/10.1038/s41467-020-19686-w, in which non-auditory (visual) reinforcement vocal learning and the contribution of Area X are shown in adult male zebra finches. I think it is particularly interesting that diverse non-auditory signals can drive vocal learning in adults and that the mechanisms involved seem to be shared. I encourage the authors to introduce and discuss this study in order to provide a bigger picture of non-auditory vocal learning and of reinforcement learning paradigms.

    I understand that the birds can always stop singing in order to receive fewer electrical shocks, but the absence of a transient effect of the electrical stimulation on the ongoing song (not only song stopping but also FM, pitch, entropy, etc.) is not demonstrated. As the authors did quantify some important features (as stated in the methods, l. 567-568), at least one example for one of these features should be shown (in a supplementary figure).

    I wonder why the analysis was restricted to the song syllables that were produced between 10 am and 12 pm. What is the rationale for such a restriction? Are the results different when considering all the song syllables produced each day? Also, the reader only finds this information in the Methods section, although it seems to me important enough to be provided in the main text.

    I was a bit surprised by the distribution of the adaptive pitch changes between cutaneous and white noise feedback (fig 2f). The sham and unoperated birds are actually quite different: the unoperated birds seem to change their pitch more when exposed to white noise, whereas the sham birds seem to change more with cutaneous stimulation. Could the authors provide some more statistics to justify the pooling of the two groups of birds?

    Surprisingly, five days after the depletion of the DA inputs to the basal ganglia (Area X), there is a change of the pitch in the anti-adaptive direction that reaches statistical significance on day 5 (fig 4c). The fact that this effect appears on the 5th day only might be related to the depletion sparing about 50% of the DA inputs to Area X. But what could be the explanation for the change in the anti-adaptive direction?

    The claim in the discussion that the experiments show that LMAN is required for the expression of non-auditory vocal learning is, in my view, not clearly supported by the data (l. 377-379). To me, the data show that LMAN is required for non-auditory vocal learning, but it is not clearly demonstrated here that LMAN is responsible only for the expression of the learning. In order to show the role of LMAN in the expression of the behavior, the authors would need to adopt a strategy similar to that of Charlesworth et al., Nature 2012 (10.1038/nature11078). I would therefore suggest that the authors refer to data from the literature to reach that conclusion.

    The discussion paragraph on the pathway that may convey the somatosensory signal to the song system is interesting, but I would encourage the authors to speculate a bit more on the pathway, considering the paper of Chen R, Puzerey PA, Roeser AC, Riccelli TE, Podury A, Maher K, Farhang AR & Goldberg JH (2019). Songbird Ventral Pallidum Sends Diverse Performance Error Signals to Dopaminergic Midbrain. Neuron; DOI: 10.1016/j.neuron.2019.04.038 (already cited); the related review from Chen R & Goldberg JH (2020). Actor-critic reinforcement learning in the songbird. Current Opinion in Neurobiology 65, 1-9 (http://www.sciencedirect.com/science/article/pii/S0959438820301173); and the study of Wild JM (1994). Visual and somatosensory inputs to the avian song system via nucleus uvaeformis (Uva) and a comparison with the projections of a similar thalamic nucleus in a nonsongbird, Columba livia. J Comp Neurol 349, 512-535 (http://onlinelibrary.wiley.com/doi/10.1002/cne.903490403/abstract).