De novo learning versus adaptation of continuous control in a manual tracking task

Abstract

How do people learn to perform tasks that require continuous adjustments of motor output, like riding a bicycle? People rely heavily on cognitive strategies when learning discrete movement tasks, but such time-consuming strategies are infeasible in continuous control tasks that demand rapid responses to ongoing sensory feedback. To understand how people can learn to perform such tasks without the benefit of cognitive strategies, we imposed a rotation/mirror reversal of visual feedback while participants performed a continuous tracking task. We analyzed behavior using a system identification approach which revealed two qualitatively different components of learning: adaptation of a baseline controller and formation of a new, task-specific continuous controller. These components exhibited different signatures in the frequency domain and were differentially engaged under the rotation/mirror reversal. Our results demonstrate that people can rapidly build a new continuous controller de novo and can simultaneously deploy this process with adaptation of an existing controller.

Article activity feed

  1. Reviewer #3:

    In the current manuscript (De novo learning and adaptation of continuous control in a manual tracking task), Yang et al. aim to demonstrate that motor adaptation to a mirror reversal perturbation to visual feedback is de-novo learning of a movement controller in contrast to the adaptation of an existing controller with rotation to visual feedback. The authors examine two different experimental paradigms, (1) continuous tracking of a cursor (trajectories generated by different sum-of-sinusoid functions) and (2) point-to-point movements, under two different visual manipulations of the cursor feedback: a 90 deg rotation and a mirror reversal. Importantly, the authors set the motion of the cursor under the continuous tracking case as a sum of sinusoidal trajectories in order to perform frequency analysis of the motion tracking. The authors then examine the behavior in the time domain, and dissect the responses at individual frequencies in the frequency domain to determine how the learning observed in each condition responds to the fast and slow changing components of the perturbation. There are two major reported results: (1) Participants learn both the mirror reversal and the rotation, but mirror-reversal learning shows little to no aftereffect, whereas rotation learning shows a ~25° aftereffect from ~70° of learning. The authors argue that this suggests that mirror-reversal learning arises from a de-novo controller that is not engaged during baseline or washout (Lines 199-200). (2) Learning in the continuous tracking task shows a gradation in performance over frequencies (i.e., higher frequencies demonstrate lower learning). These are interesting experiments, with a well-defined motivation/question and (mostly) clear presentation of results. The figures and results largely support the hypothesis. My specific comments are shown below:

    1. In the abstract, the last line says 'Our results demonstrate that people can rapidly build a new continuous controller de novo and can flexibly integrate this process with adaptation of an existing controller'. It's not clear if the authors have shown the latter definitively. What is the reasoning for this statement, "flexibly integrate this process with adaptation of an existing controller"? It would seem you would need the same subjects to perform both experimental tasks (mirror reversal and VMR) concurrently to make this claim.

    2. It would be helpful if the authors could provide more background/context on their view of de novo learning and explanations on the relationship between de novo learning and the adapted controller model. For example, why does the lack of aftereffects under the mirror-reversal imply that the participants did not counter this perturbation via adaptation and instead engaged the learning by forming a de novo controller (Line 199)? Is the reasoning purely behavioral observations, or is there a physiological basis for this assertion?

    3. Details about frequency analysis are buried deep in the methods (around line 711), especially how the hand-target coherence (shown in 4B) is calculated. It would be helpful to include some of these details in the main text. For example, it is currently very difficult to understand the relationship when moving from Figure 4A to 4B.

    4. Lines 197-199: The reason for the lack of after-effects in the mean-squared error analysis is a little vague. It took a few tries to understand the reasoning. It would be good to spell this out a little more clearly.

    5. Lines 223-225: The logic behind why coupling across axes is not nonlinear behavior seems to be missing. It's quite unclear and currently difficult to understand. It would be very helpful to spell this out too.

    6. Surprisingly, there is no measurement of aiming in the learning to VMR. Several motor learning studies (several the authors cite) show that learning in VMR is a combination of implicit and explicit. I understand that this is not possible in the continuous tracking task, but can certainly be done in the point-to-point task. Is there a reason this was not done? Wouldn't this have further supported the authors' claim of an existing controller?

    7. Figure 2C: the data for mirror-reversal seems to have a weird uptick in the error. Why would that be? Is there an explanation for this?

    8. Lines 339-342: the results show that mirror-reversal learning is low at high frequencies (Fig 5B). The authors interpret this as reason to believe that this is actually de-novo learning and not adaptation of an existing controller. This seems somewhat unfounded. Could it be that de novo learning performs well at low frequency, through 'catch-up' movements, but not at high frequencies? Do the authors have a counter argument for this explanation?

    9. Lines 343 - 350: The authors ascribe the difference between after-effects and end of learning to be due to de-novo learning even in the rotation group. However, that difference would likely be due to the use of explicit strategy during learning and its disengagement afterwards, or perhaps a temporally labile learning. Can the authors rule these possibilities out? What were the instructions given at the end of the block and how much time elapsed?

    10. Line 787: Outlier rejection based on some subjects who had greatly magnified or attenuated data seems like it might be biasing the data. Also, the outlier rejection criterion used (>1.5 IQR) seems very stringent. Furthermore, it appears there was no outlier rejection on the main experiment. It would be good to be consistent across experiments.

    11. Figure 4: The authors show the tracking strategies participants applied by investigating the relationship between hand and target movement. The linear relationship would suggest that participants tracked the target using continuous movements. In contrast, a nonlinear relationship would suggest that participants used an alternative tracking strategy. The authors only state that this relationship is based on Figure 4 but do not seem to provide any proof of the linearity. It would be more convincing to provide an analysis to show that the relationship is indeed linear or nonlinear.
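
For reference on comment 10, a 1.5 × IQR criterion is commonly implemented as Tukey's fences: a value is flagged when it falls more than 1.5 interquartile ranges below the first quartile or above the third. The sketch below is a minimal illustration under that assumption, not the authors' code; the function names and the per-participant gain values are hypothetical and made up for the example.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (low, high) bounds; values outside them count as outliers."""
    # quantiles() sorts internally; n=4 yields the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return one boolean per value: True when it lies outside the fences."""
    low, high = tukey_fences(values, k)
    return [v < low or v > high for v in values]


# Hypothetical per-participant gains (made-up numbers, not data from the study).
gains = [0.92, 1.01, 0.97, 1.05, 0.99, 1.03, 2.40]
flags = flag_outliers(gains)
kept = [g for g, is_outlier in zip(gains, flags) if not is_outlier]
```

Raising `k` (for example to 3) widens the fences and rejects fewer participants, which is one way to examine how sensitive a result is to the rejection threshold and to apply the same rule across experiments.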

  2. Reviewer #2:

    This manuscript asks how learners solve the problem of continuous motor control. The authors find qualitatively distinct components of learning under continuous tracking conditions: the adaptation of a baseline controller and the formation of a new task-specific continuous controller. These learning components were differentially engaged for rotation-learning and mirror-reversal. Further, the authors present a methodological advance in motor control and learning analysis that relies on frequency-based system identification techniques.

    Overall, this paper presented a valuable third perspective on the learning processes that underlie motor performance and provided an impressive analysis of continuous control data. Furthermore, the system identification technique that they developed will likely be of great value to the study of motor learning. However, I believe that there are some issues with the framing of the de novo learning mechanism and in their interpretation of the results.

    1. Positing a de novo learning mechanism as the absence of established learning process signatures.

    The authors introduce the concept of de novo learning in contrast to both error-driven adaptation and re-aiming: 'a motor task could be learned by forming a de novo controller, rather than through adaptation or re-aiming.' However, the discussion reframes de novo learning as purely in contrast with implicit adaptation: '[...] de novo learning refers to any mechanism, aside from implicit adaptation, that leads to the creation of a new controller'. While this apparent shift in perspective is likely due to their results and realistically represents the scientific process, this shift should be more explicitly communicated.

    As explicitly raised in the discussion and suggested in the introduction, the authors have categorized any learning process that is not implicit adaptation as a de novo learning process. To substantiate this conceptual decision, the authors should further explain why motor learning unaccounted for by established learning processes should be accounted for by a de novo learning process.

    2. The distinction between de novo learning and re-aiming is unclear.

    Participants could not learn mirror-reversal under continuous tracking without the point-to-point task, which the authors interpret to mean that re-aiming is important for the acquisition of a de novo controller. This suggests that re-aiming may not be important for the execution of a de novo controller.

    However, the frequency-based performance analysis presented in the main experiment would seem to suggest otherwise. As mentioned in the introduction, low stimulus frequencies allow a catch-up strategy. Both rotation and mirror groups were successful at compensating at low frequencies but the mirror-reversal group was largely unsuccessful at high frequencies. Assuming that higher frequencies inhibit cognitive strategy, this suggests to me that catch-up strategies might be essential to mirror-reversal, possibly not only during learning but also during execution.

    Further, the authors note that, in the rotation group, aftereffects only accounted for a fraction of total compensation, then suggest that residual learning not accounted for by adaptation was attributable to the same de novo learning process driving mirror reversal. This framing makes it unclear to me how the authors think re-aiming fits into the concept of a de novo learning process (e.g. Is all learning not driven by implicit adaptation de novo learning? What about the role of re-aiming?)

    3. Interpretation of spectral linearity as support for the absence of a catch-up strategy.

    Using linearity as a metric for mechanistic inference has limitations.

    • The absence of learning (errors) would present as nonlinearity.
    • The use of cognitive strategy could present as nonlinearity.
    • It doesn't seem possible to parse the two mechanisms, especially as you might expect both an increase in error at the beginning of learning and possibly an intervening cognitive strategy at the beginning of learning.

    Given these issues, a more grounded interpretation is that linearity simply represents real-time updating. If the relationship between the cursor and the hand is nonlinear, then updating is not in real time.

    The data shown in Fig 4B do not appear to provide clear evidence that the relationship between the cursor and the hand was approximately linear. Currently, it seems equally plausible to say that the data are approximately non-linear. Establishing a criterion for nonlinearity would be useful (e.g. shuffling a linear response for comparison).

    4. The presentation of mean-squared error in Figure 2 seems to have limited utility. As the authors mention, it does not arbitrate between mechanisms or represent the aftereffects observed in rotation learning. I suggest removing panel 2C altogether and magnifying panel 2B so that the reader can better appreciate the raw data.
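
One way to make the suggested nonlinearity criterion in point 3 concrete is to compare the observed hand-target coherence at a stimulus frequency against a null distribution obtained by shuffling which hand trace is paired with which target trace. The sketch below is only an illustration under stated assumptions: the data are synthetic, target phases are assumed to differ across trials (otherwise shuffling would not break the pairing), the trial duration is assumed to contain a whole number of cycles, and all names are hypothetical. It is not the authors' analysis.

```python
import cmath
import math
import random


def fourier_coefficient(signal, freq_hz, sample_rate_hz):
    """Complex Fourier coefficient of `signal` at a single frequency."""
    n = len(signal)
    return sum(
        x * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate_hz)
        for i, x in enumerate(signal)
    ) / n


def coherence(target_coefs, hand_coefs):
    """Magnitude-squared coherence across trials, between 0 and 1."""
    cross = sum(x * y.conjugate() for x, y in zip(target_coefs, hand_coefs))
    power_x = sum(abs(x) ** 2 for x in target_coefs)
    power_y = sum(abs(y) ** 2 for y in hand_coefs)
    return abs(cross) ** 2 / (power_x * power_y)


def shuffled_null(target_coefs, hand_coefs, n_shuffles, rng):
    """Coherence values after randomly re-pairing hand and target trials."""
    null = []
    for _ in range(n_shuffles):
        permuted = hand_coefs[:]
        rng.shuffle(permuted)
        null.append(coherence(target_coefs, permuted))
    return null


# Synthetic demo: a linear "hand" response (scaled, phase-lagged, noisy).
rng = random.Random(0)
sample_rate_hz, duration_s, freq_hz = 50, 4, 0.5
n_samples = sample_rate_hz * duration_s
target_coefs, hand_coefs = [], []
for _ in range(20):
    phase = rng.uniform(0, 2 * math.pi)  # assumed to vary from trial to trial
    angles = [2 * math.pi * freq_hz * i / sample_rate_hz + phase
              for i in range(n_samples)]
    target = [math.sin(a) for a in angles]
    hand = [0.8 * math.sin(a - 0.6) + rng.gauss(0, 0.2) for a in angles]
    target_coefs.append(fourier_coefficient(target, freq_hz, sample_rate_hz))
    hand_coefs.append(fourier_coefficient(hand, freq_hz, sample_rate_hz))

observed = coherence(target_coefs, hand_coefs)
null = shuffled_null(target_coefs, hand_coefs, 200, rng)
threshold = sorted(null)[int(0.95 * len(null))]  # empirical 95th percentile
```

Under this scheme, a frequency would count as linearly tracked when `observed` exceeds `threshold`. A low coherence value alone would still not separate absent learning from an intermittent strategy, which is the limitation raised in the bullets of point 3.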

  3. Reviewer #1 (Timothy Verstynen):

    This work looks at "de novo learning" in the context of fast continuous tasks, i.e., shifts of control policies (or controllers), rather than parameter changes in existing policies that occur with visuomotor adaptation. In a set of 2 experiments, using a mixture of discrete point-to-point movement trials and continuous tracking of moving target trials, the authors set out to determine whether the structure of shifts between visual and proprioceptive information determines whether learning relies on adaptation or shifts in control policies. Using both the presence of post-shift aftereffects and trialwise model fitting, the authors find that simple rotations of visual inputs of the hand lead primarily to changes in control parameters while mirror reversals lead to changes in the control policy itself, although there was evidence for a mixture of adaptation and de novo learning in both conditions. The authors infer from this evidence that humans can rapidly and flexibly shift control policies in response to environmental perturbations.

    In general this was a very cleverly designed and executed set of studies. The theoretical framing and experimental design are clean and clear. The data are compelling on the existence of condition differences. However, there are some concerns that temper my acceptance of the key inferences being made about de novo policy shifts.

    Major concerns:

    1. Inferential logic

    There are two key parts to the analyses used to infer that mirror reversals lead to de novo policy shifts while rotations lead to adaptation. The first is the presence of post-perturbation aftereffects. The second is the set of alignment matrices (in both immediate hand position and movement frequency spaces) that are estimated based on model fits to the data. I'll consider both in turn.

    First, while we clearly see stronger aftereffects in the rotation condition than in the mirror reversal condition, suggesting a difference in fundamental control mechanisms, it is not clear why control policy shifts are the only alternative explanation for attenuated aftereffects. I'm pretty sure that this is just a confusion based on how the problem is posed in the paper.

    Second, and perhaps more problematically, the alignment matrices (Fig. 3A) and vectors (Fig. 3A, 5B, 6B), based on the model fits, show a very high degree of variability across conditions and do not perfectly align to the simple predictions shown in Fig. 3A. I do agree that, if you squint at the mean vector direction, they look qualitatively consistent with the models, but only qualitatively. In fact, the fits to the "ideal" shifts or rotations (Fig. 5C, 6C) suggest only partial alignment to the pure models. How are we sure that this isn't reflecting an alternative mechanism, instead of partial de novo learning?

    In both the aftereffect and alignment fit analyses, the inference for de novo learning seems to be based on either a null result (i.e., no aftereffect in mirror reversal) or partial fits to a specific model. This leaves the main conclusions on somewhat shaky ground.

    2. Linearity analysis

    I had a really hard time understanding the analysis leading to the conclusion that there is a linear relationship between target motion and hand motion. The logic of the spectral analysis was not clear to me and the results shown in Figure 4 were not intuitive. In addition, there was no actual quantification used to make a conclusion about linearity. Thus it was difficult to determine whether this aspect of the authors' conclusion (a critical inference for them to justify their main conclusion) was correct.

    3. Statistical results

    Many of the key statistical results were buried in the main text and some were incompletely reported. Can the authors provide a table (or set of tables) of the key statistics, including at least the value of the statistical test itself and the p-value, if not also some measure of confidence in the estimates?

    4. Experiment 2

    The intention for experiment 2 is to see how much training on the point-to-point task influenced adaptation mechanisms during the tracking task. Yet this experiment still included extensive exposure to the point-to-point task, just not as much as in experiment 1. Given this, how can an inference be cleanly made about the influence of one task on the other? Wouldn't the clean way to ask this question be to just not run the point-to-point task at all?

    5. Frequency analysis

    The authors state that "The failure to compensate at high frequencies ... is consistent with the observation that people who have learned to make point-to-point movements under mirror-reversed feedback are unable to generate appropriate rapid corrections to unexpected perturbations." This logic is not clear. How does the pattern of which movement frequencies show an effect, and which do not, lead to this conclusion?

    Minor comments:

    Pg. 10, line 330: The authors report that "compensation for the visuomotor rotation resulted in reach-direct aftereffects of similar magnitude to that reported in previous studies". Please cite those studies here.

    Pg. 18, lines 661-668: There is only a description of the first experiment but not the second.

    Figure 5, supplement 1 seems to be a critical image for understanding the different dynamics of realignment between the rotation and mirror-reversal tasks. It seems better to have it be a main figure instead of a supplement.