Two time scales of adaptation in human learning rates
Curation statements for this article:
Curated by eLife
eLife Assessment
This study makes a valuable contribution by separating two timescales of adaptation: rapid, within-block reductions in learning rate, and slower, location-specific, meta-learned adjustments. Behavioural data and computational modeling converge to support both processes. The evidence is solid, with neuroimaging results suggesting that meta-learned learning rates are encoded in the orbitofrontal cortex, while prediction errors are represented in a distributed network including the ventral striatum and are modulated by expected error magnitude, though the specificity of these effects requires further contextualization. The manuscript is timely and clearly written; its main limitation is the weak linkage between neural signals and behavior, leaving uncertainty over whether the reported signals play a mechanistic role in learning.
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (eLife)
Abstract
Different situations may require radically different information updating speeds (i.e., learning rates). Some demand fast learning rates, while others benefit from using slower ones. To adjust learning rates, people could rely on either global, meta-learned differences between environments, or faster but transient adaptations to locally experienced prediction errors. Here, we introduce a new paradigm that allows researchers to measure and empirically disentangle both forms of adaptation. Participants performed short blocks of trials of a continuous estimation task – fishing for crabs – on six different islands that required different optimal (initial) learning rates. Across two experiments, participants showed fast adaptations in learning rate within a block. Critically, participants also learned global environment-specific learning rates over the time course of the experiment, as evidenced by computational modelling and by the learning rates calculated on the very first trial when revisiting an environment (i.e., unconfounded by transient adaptations). Using representational similarity analyses of fMRI data, we found that differences in voxel pattern responses in the central orbitofrontal cortex correlated with differences in these global environment-specific learning rates. Our findings show that humans adapt learning rates at both slow and fast time scales, and that the central orbitofrontal cortex may support meta-learning by representing environment-specific task-relevant features such as learning rates.
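To make the notion of a learning rate concrete, the following is a minimal sketch, not the authors' model. It assumes a standard delta rule, recovers the learning rate implied by a single update as the ratio of update to prediction error, and lets the rate shrink within a block as rate / (1 + rate), which is the schedule of an ideal observer estimating a fixed mean. All function names and the decay schedule are illustrative assumptions.

```python
def delta_rule_update(estimate, outcome, learning_rate):
    """Move the estimate toward the outcome by a fraction of the prediction error."""
    prediction_error = outcome - estimate
    return estimate + learning_rate * prediction_error


def empirical_learning_rate(prev_estimate, new_estimate, outcome):
    """Learning rate implied by one observed update (update / prediction error).

    Returns None when the prediction error is zero, because the ratio is undefined.
    """
    prediction_error = outcome - prev_estimate
    if prediction_error == 0:
        return None
    return (new_estimate - prev_estimate) / prediction_error


def run_block(outcomes, initial_estimate, initial_learning_rate):
    """Simulate one block of trials with a within-block decaying learning rate.

    Assumption (hypothetical schedule): after each trial the rate becomes
    rate / (1 + rate), so only the initial rate is environment-specific.
    """
    estimate = initial_estimate
    rate = initial_learning_rate
    recovered = []
    for outcome in outcomes:
        new_estimate = delta_rule_update(estimate, outcome, rate)
        recovered.append(empirical_learning_rate(estimate, new_estimate, outcome))
        estimate = new_estimate
        rate = rate / (1 + rate)  # transient, within-block reduction
    return estimate, recovered


if __name__ == "__main__":
    final_estimate, rates = run_block([10, 10, 10], 0.0, 0.5)
    print(final_estimate, rates)
```

In this sketch the first recovered rate of a block equals the initial rate passed in, which mirrors why the first trial after revisiting an environment is informative about the slower, environment-specific adjustment.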
Article activity feed
Reviewer #1 (Public review):
Summary:
Simoens and colleagues use a continuous estimation task to disentangle learning rate adjustments on shorter and longer timescales. They show that participants rapidly decrease learning rates within a block of trials in a given "location", but that they also adjust learning rates for the very first trial based on information accrued gradually about the statistics of each location, which can be viewed as a form of metalearning. The authors show that the metalearned learning rates are represented in patterns of neural activity in the orbitofrontal cortex, and that prediction errors are represented in a constellation of brain regions, including the ventral striatum, where they are modulated by expectations about error magnitude to some degree. Overall, the work is interesting, timely, and well communicated. My primary concern with the work was that the link between the brain signals and their role in the behavior of interest was not well explored, raising some questions about the degree to which signals are really involved in the learning process, versus playing some downstream role.
Strengths:
The authors build on an interesting task design, allowing them to distinguish moment-to-moment adjustments in learning rate from slower adjustments in learning rate corresponding to slowly gained knowledge about the statistics of specific "locations". Behavior and computational modeling clearly demonstrate that individuals adjust to environmental statistics in a form of metalearning. fMRI data reveal representations of interest, including those related to adjusted learning rates and their impact on the degree of prediction error encoding in the striatum.
Weaknesses:
It was nice to see that the authors could distinguish differences between the OFC signals that they observed and those in the visual regions based on changes through the session. However, the linkage between these brain activations and a functional role in generating behavior was left unexplored. Without further exploration, it is hard to tell exactly what role the signals might be playing, if any, in the behavior of interest.
Reviewer #2 (Public review):
Summary:
Across two experiments, this work presents a novel spatial predictive inference paradigm that facilitates the investigation of meta-learning across multiple environments with distinct statistics, as well as more local learning from sequences of observations within an environment. The authors present behavioral data indicating that people can indeed learn to distinguish between noise levels and calibrate their learning rates accordingly across environments, even on initial trials when revisiting an environment. They complement their behavioral results with computational modeling, further bolstering claims of both local and global adaptation. Additional fMRI results support the role of OFC in this meta-learning process, with central OFC activity reflecting similarity between environments. This similarity emerges over time with task experience. Holistically, this paradigm and these data add to our understanding of how humans dynamically adapt their behavior on different timescales.
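The logic of relating neural similarity between environments to differences in learning rate can be sketched as follows. This is an illustrative toy, not the authors' pipeline: it assumes correlation distance between voxel patterns, absolute differences between learning rates as the model dissimilarity, and a Pearson correlation between the two sets of pairwise dissimilarities. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def model_dissimilarities(learning_rates):
    """Pairwise absolute differences in learning rate (one value per environment pair)."""
    return [abs(a - b) for a, b in combinations(learning_rates, 2)]


def neural_dissimilarities(patterns):
    """Pairwise correlation distance (1 - r) between voxel patterns."""
    return [1 - pearson(p, q) for p, q in combinations(patterns, 2)]


def rsa_score(learning_rates, patterns):
    """Correlate model and neural dissimilarities across environment pairs."""
    return pearson(model_dissimilarities(learning_rates),
                   neural_dissimilarities(patterns))


if __name__ == "__main__":
    # Toy data: three environments, four voxels each (made up for illustration).
    rates = [0.2, 0.5, 0.8]
    patterns = [[1, 2, 3, 4], [1, 2, 4, 3], [4, 3, 2, 1]]
    print(rsa_score(rates, patterns))
```

A positive score here only says that environments with more different learning rates also have more different voxel patterns; it does not by itself separate learning rate from other environment features that covary with it.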
Strengths:
The novel paradigm represents a clever and creative expansion of spatial predictive inference tasks. The cover story was well chosen to facilitate an intuitive understanding of both the differences between environments and the estimation of the mean within environments.
Additionally, the authors present complementary results from two experiments, which strengthen the behavioral findings. This is especially effective as the initial experiment's results were a bit noisy, and the modifications within the second experiment increased both power and the specificity/accuracy of participant predictions. Taken together, the behavioral results provide convincing evidence that participants did distinguish environments based on their underlying statistics and adapted their initial behavior accordingly.
Beyond this, the combination of behavioral results, computational modeling, and neuroimaging enhances the impact of the work. It paints a fuller picture of whether and how humans meta-learn the global statistics of environments, and this is an important direction for the field of adaptive learning.
Weaknesses:
(1) The authors make the distinction between meta-learned "global" learning rates and within-environment learning rate adaptation in response to "local" fluctuations/observations. Though the experimental paradigm is novel, there are certainly links to prior work - for instance, though change point structures don't entail revisiting unique environments, they do require meta-learning from environmental statistics that is distinct from transient local adaptation to prediction errors. This tendency to increase one's learning rate after large prediction errors is appropriate in change point environments, though, as is true in this study, the amount of increase should be dependent on. This represents a similar kind of slower-timescale learning or reuse of more "global" parameters, and can be seen to different extents in prior work. It might benefit readers if the authors were to link the current work to previous research more explicitly to draw clearer connections between the approaches and findings.
(2) Throughout much of the paper, the authors refer to the distinctions between environments primarily as differences in "initial learning rates" or "environment-specific learning rates." This is particularly prominent when discussing fMRI results. Though the optimal initial learning rate did differ across environments, this was the result of differences in underlying task statistics. It will be important to clarify this throughout the text: because of the confounds between task statistics and initial learning rate (and, to some extent, the position on the screen), it is not possible to separate the impact of these specific variables. This is also relevant to understanding the justification for using methods like RSA to test whether brain regions represent task states similarly. If the main hypothesis is that neural activity reflects the (initial) learning rate itself, then a univariate analysis approach would seem more natural.
(3) For the neuroimaging results in particular, the specificity of some of the results (e.g. ventral striatum showing an effect of prediction error only in the low noise condition in the second half of task experience, only on the first trial) is a bit surprising. Additional justification of or context for these results would be useful to help readers gauge how expected or surprising these findings are.
(4) There are some methodological details that are unclear (e.g., how were the positions of the crabs selected relative to the location they emerged from? Looking at Figure 1C, it looks like the crabs spread out unevenly, and that the single position they emerge from is not necessarily at the center of the crab locations.) Additional detail and clarity would help address some unanswered questions (more details below).