Regulation of sensorimotor serial learning in speech production by motor compensation rather than sensory error

Curation statements for this article:
  • Curated by eLife

    eLife Assessment

    This study investigates how people adapt their speech when auditory feedback is altered. The analyses are rigorous and the work makes a valuable contribution by extending methods from limb motor control to speech. However, because the paradigm does not directly measure sensory error, the evidence for the proposed mechanism of sensorimotor learning is incomplete. The findings are best viewed as evidence for how prior motor adjustments influence subsequent behaviour, highlighting the need for future studies to more precisely separate sensory and motor contributions to adaptation.

Abstract

Motor control is essential for organisms to efficiently interact with the environment by maintaining accurate action and adjusting to future changes. Speech production, one of the most complex motor behaviors, relies on a feedback control process to detect sensory errors and trigger updates in a feedforward control process that implements compensations. However, the specific contributions of these critical processes in sensorimotor learning during continuous vocal production remain debated. Here, we used two experimental designs in five experiments to dissociate these mechanisms.

First, we employed a serial-dependence design with randomized pitch perturbations, dissociating the influences of sensory errors and motor compensation on subsequent vocalizations on a trial-by-trial basis. We found that motor compensation, rather than sensory errors, predicted the compensatory responses in the subsequent trials, suggesting instantaneous serial learning mediated by updates in the feedforward process. This compensation-driven serial learning generalized across productions of different vowel categories. Second, we implemented a serial-dependence adaptation design in a sentential context, where auditory perturbation occurred only on a preceding syllable. Any learning effects in the subsequent, unperturbed syllable would therefore reflect changes in the speech motor representation. Our results consistently revealed that compensation in the preceding syllable predicted pitch changes in the subsequent syllable, but only when the two adjacent syllables were embedded within a word boundary. Collectively, the study provides ecologically valid evidence that error-based motor compensation, incorporating cognitive and linguistic constraints, directly regulates the speech motor representation and mediates instantaneous serial learning in successive actions.
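
The trial-by-trial logic of the serial-dependence design can be illustrated with a small regression on simulated data. This is a minimal sketch under stated assumptions, not the authors' analysis: the function names, perturbation sizes, response gain, noise levels, and the linear carry-over rule are all hypothetical. The simulation builds in a dependence on the previous trial's compensation only, then checks whether an ordinary least-squares fit attributes the carry-over to previous compensation rather than to the previous perturbation (used here as a proxy for sensory error).

```python
import numpy as np


def simulate_trials(n_trials=2000, beta_prev_comp=0.3, seed=0):
    """Simulate hypothetical trials (all parameter values are assumptions)."""
    rng = np.random.default_rng(seed)
    # Randomized pitch perturbation on each trial, in cents (hypothetical sizes).
    perturbation = rng.choice([-200.0, -100.0, 100.0, 200.0], size=n_trials)
    compensation = np.zeros(n_trials)
    for t in range(n_trials):
        # The within-trial response opposes the perturbation with a noisy gain,
        # so compensation is not a fixed function of perturbation.
        gain = 0.2 + 0.1 * rng.standard_normal()
        # Assumed carry-over: a fraction of the previous trial's compensation.
        carry_over = beta_prev_comp * compensation[t - 1] if t > 0 else 0.0
        noise = 5.0 * rng.standard_normal()
        compensation[t] = -gain * perturbation[t] + carry_over + noise
    return perturbation, compensation


def fit_serial_dependence(perturbation, compensation):
    """Regress compensation on trial n on predictors from trials n and n-1."""
    y = compensation[1:]
    design = np.column_stack([
        np.ones(len(y)),     # intercept
        perturbation[1:],    # current perturbation
        perturbation[:-1],   # previous perturbation (proxy for sensory error)
        compensation[:-1],   # previous motor compensation
    ])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    names = ["intercept", "p_curr", "p_prev", "c_prev"]
    return dict(zip(names, coef))


if __name__ == "__main__":
    p, c = simulate_trials()
    for name, value in fit_serial_dependence(p, c).items():
        print(f"{name}: {value:.3f}")
```

Because the simulated gain varies from trial to trial, previous perturbation and previous compensation are correlated but not collinear, which is what allows the two coefficients to be estimated separately. The sketch treats the physical perturbation as a stand-in for sensory error; it does not model perceived error as a separate quantity.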

Article activity feed

  2. Reviewer #1 (Public review):

    Summary:

    In this submitted manuscript, Lu, Tang, and colleagues implement a novel serial perturbation paradigm during speech to isolate the effects of sensory and motor processes on compensation. They perform three main studies. In the first, they validate their method by randomly perturbing pitch in a series of produced vowels, and demonstrate that the amount of compensation on a given trial is driven (in part) by the amount of motor compensation applied on the previous trial, as opposed to the previous sensory perturbation. In the second, they find that this effect carries over to single-vowel words, but that it is much weaker when different words are produced. In the third, the authors reproduce these findings in a more linguistically relevant setting (during sentences) and show that the compensation effect occurs only within syntactic structures and not across them, suggesting an interplay between sensorimotor systems and linguistic structure processing.

    Strengths:

    Overall, this is a unique study and strikes me as being potentially quite impactful. The authors have performed a large number of experiments to validate their findings, which provide novel insights into the processes underlying compensation during speech production. These findings are also likely to open new avenues for studying the neural mechanisms that support these processes.

    Weaknesses:

    While the authors go to great lengths to dissociate the serial effects of sensory error and motor compensation, which is commendable, one weakness is that the two are intrinsically linked (motor actions produce sensory consequences). Therefore, there is no obvious way to decouple them for the purposes of investigation. It would be beneficial to discuss future research that could further disentangle these factors.

  3. Reviewer #2 (Public review):

    This study aims to disentangle the contribution of sensory and motor processes (mapped onto the inverse and forward components of speech motor control models like DIVA) to production changes as a result of altered auditory feedback. After five experiments, the authors conclude that it is the motor compensation on the previous trial, and not the sensory error, that drives compensatory responses in subsequent trials.

    Assessment:

    The goal of this paper is great, and the question is timely. Quite a bit of work has gone into the study, and the technical aspects are sound. That said, I just don't understand how the current design can accomplish what the authors have set as their goal. This may, of course, be a misunderstanding on my part, so I'll try to explain my confusion below. If it is indeed my mistake, then I encourage the authors to dedicate some space to unpacking the logic in the Introduction, which is currently barely over a page long. They should take some time to lay out the logic of the experimental design and the dependent and independent variables, and how this design disentangles sensory and motor influences. Then clearly discuss the opposing predictions supporting sensory-driven vs. motor-driven changes. Given that I currently don't understand the logic and, consequently, the claims, I will focus my review on major points for now.

    Main issues

    (1) Measuring sensory change. As acknowledged by the authors, making a motor correction as a function of altered auditory feedback is an interactive process between sensory and motor systems. However, one could still ask whether it is primarily a change to perception vs. a change to production that is driving the motor correction. But to do this, one has to have two sets of measurements: (a) perceptual change, and (b) motor change. As far as I understand, the study has the latter (i.e., C), but not the former. Instead, the magnitude of perceptual change is estimated through the proxy of the magnitude of perturbation (P), but the two are not the same; P is a physical manipulation; perceptual change is a psychological response to that physical manipulation. It is theoretically possible that a physical change does not cause a psychological change, or that the magnitude of the two does not match. So my first confusion centers on the absence of any measure of sensory change in this study.

    To give an explicit example of what I mean, consider a study like Murphy, Nozari, and Holt (2024; Psychonomic Bulletin & Review). This work is about changes to production as a function of exposure to other talkers' acoustic properties - rather than your own altered feedback - but the idea is that the same sensory-motor loop is involved in both. When changing the acoustic properties of the input, the authors obtain two separate measures: (a) how listeners' perception changes as a function of this physical change in the acoustics of the auditory signal, and (b) how their production changes. This allows the authors to identify motor changes above and beyond perceptual changes. Perhaps making a direct comparison with this study would help the reader understand the parallels better.

    (2) A more fundamental issue for me is a theoretical one: Isn't a compensatory motor change ALWAYS a consequence of a perceptual change? I think it makes sense to ask, "Does a motor compensation hinge on a previous motor action or is sensory change enough to drive motor compensation?" This question has been asked for changed acoustics for self-produced speech (e.g., Hantzsch, Parrell, & Niziolek, 2022) and other-produced speech (Murphy, Holt, & Nozari, 2025), and in both cases, the answer has been that sensory changes alone are, in fact, sufficient to drive motor changes. A similar finding has been reported for the role of the cerebellum in limb movements (Tseng et al., 2007), with a similar answer (note that in that study, the authors explicitly talk about "the addition" of motor corrections to sensory error, not one vs. the other as two independent factors). So I don't understand a sentence like "We found that motor compensation, rather than sensory errors, predicted the compensatory responses in the subsequent trials", which views motor compensations and sensory errors as orthogonal variables affecting future motor adjustments.

    In other words, there is a certain degree of seriality to the compensation process, with sensory changes preceding motor corrections. If the authors disagree with this, they should explain how an alternative is possible. If they mean something else, a comparison with the above studies and explaining the differences in positions would greatly help.

    (3) Clash with previous findings. I used the examples in point 2 to bring up a theoretical issue, but those examples are also important in that all three of them reach a conclusion compatible with one another and different from the current study. The authors do discuss Tseng et al.'s findings, which oppose their own, but dismiss the opposition based on limb vs. articulator differences. I don't find the authors' reasoning theoretically convincing here, but more importantly, the current claims also oppose findings from speech motor studies (see citations in point 2), to which the authors' arguments simply don't apply. Strangely, Hantzsch et al.'s study has been cited a few times, but never in its most important capacity, which is to show that speech motor adaptation can take place after a single exposure to auditory error. Murphy et al. report a similar finding in the context of exposure to other talkers' speech.

    If the authors can convincingly justify their theoretical position in 2, the next step would be to present a thorough comparison with the results of the three studies above. If indeed there is no discrepancy, this comparison would help clarify it.

    References

    Hantzsch, L., Parrell, B., & Niziolek, C. A. (2022). A single exposure to altered auditory feedback causes observable sensorimotor adaptation in speech. eLife, 11, e73694.

    Murphy, T. K., Nozari, N., & Holt, L. L. (2024). Transfer of statistical learning from passive speech perception to speech production. Psychonomic Bulletin & Review, 31(3), 1193-1205.

    Murphy, T. K., Holt, L. L., & Nozari, N. (2025). Exposure to an accent transfers to speech production in a single shot. Preprint available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5196109.

    Tseng, Y. W., Diedrichsen, J., Krakauer, J. W., Shadmehr, R., & Bastian, A. J. (2007). Sensory prediction errors drive cerebellum-dependent adaptation of reaching. Journal of Neurophysiology, 98(1), 54-62.