Improved sensory representations as a result of temporal adaptation
Curation statements for this article:-
Curated by eLife
eLife Assessment
This valuable study examined how sensory adaptation supports visual perception in the presence of noise. The authors used a combination of human psychophysics, electroencephalography (EEG), and deep neural networks to show that adaptation to noise can improve perception. The results are solid but are, at present, weakened by a number of concerns, including some related to the experimental design and some regarding the interpretation of the results in terms of particular mechanisms. With these concerns adequately addressed, the study and conclusions would be likely to be of broad interest to the neuroscience community.
Abstract
Human perception is robust under challenging conditions, for example when sensory inputs change over time. Temporal adaptation in the form of reduced responses to repeated external stimuli is ubiquitously observed in the brain, yet it remains unclear how repetition suppression aids recognition of novel inputs. To clarify this, we collected behavioural and electroencephalography (EEG) measurements while human participants categorized objects embedded in visual noise patterns after first viewing these patterns in isolation, inducing adaptation to the noise stimulus. We furthermore manipulated the availability of object information in the visual input by varying the contrast of the noise-embedded objects. Our results provide convergent behavioral, neural and computational evidence of a benefit of temporal adaptation on sensory representations. Adapting to a noise pattern resulted in overall faster object recognition and better recognition of objects as object contrast increased. These adaptation-induced behavioral improvements were accompanied by more pronounced contrast-dependent modulation of object-evoked EEG responses, and better decoding of object information from EEG activity. To identify potential neural computations mediating the benefits of temporal adaptation on object recognition, we equipped task-optimized deep convolutional neural networks (DCNNs) with different candidate mechanisms to adjust network activations over time. DCNNs with intrinsic adaptation mechanisms, such as additive suppression, best captured contrast-dependent human performance benefits, whilst also showing improved object decoding as a result of adaptation. Finally, adaptation effects in networks that use temporal divisive normalization, a biologically-plausible canonical neural computation, were most robust to spatial shifts, suggesting that temporal adaptation via divisive normalization aids stable representations of time-varying visual inputs.
Overall, our results demonstrate how temporal adaptation improves sensory representations and identify candidate neural computations mediating these effects.
Article activity feed
Reviewer #1 (Public review):
The authors sought to investigate the role of adaptation in supporting object recognition. In particular, they examined the extent to which adaptation to noise improves subsequent recognition of objects embedded in the same or similar noise, and how this interacts with target contrast. The authors approach this question using a combination of psychophysics, electroencephalography, and deep neural networks. They find better behavioural performance and multivariate decoding of stimuli preceded by noise, suggesting a beneficial effect of adaptation to noise. The neural network analysis seeks to provide a deeper explanation of the results by comparing how well different adaptation mechanisms capture the empirical behavioural results. The results show that models incorporating intrinsic adaptation mechanisms, such as additive suppression and divisive normalisation, capture the behavioural results better than those that incorporate recurrent interactions. The study has the potential to provide interesting insights into adaptation, but there are alternative (arguably more parsimonious) explanations for the results that have not been refuted (or even recognised) in the manuscript. If these confounds can be compellingly addressed, then I expect the results would be of interest to a broad range of readers.
The study uses a multi-modal approach, which provides a rich characterisation of the phenomenon. The methods are described clearly, and the accompanying code and data are made publicly available. The comparison between univariate and multivariate analyses is interesting, and the application of neural networks to distinguish between different models of adaptation seems quite promising.
There are several concerning confounding factors that need to be addressed before the results can be meaningfully interpreted. In particular, differences in behavioural accuracy may be explained by a simple change detection mechanism in the "same noise" condition, and temporal cuing by the "adaptor" stimulus may explain differences in reaction time. Similarly, interference between event-related potentials may explain the univariate EEG results, and biased decoder training may explain the multivariate results. Thus, it is currently unclear if any of the results reflect adaptation.
My main concerns relate to how adaptation is induced and how differences between conditions are interpreted. The adaptation period is only 1.5 s. Although brief adaptors (~1 s) can produce stimulus history effects, it is unclear whether these reflect the same mechanisms as those observed with standard, longer adaptation durations (e.g., 10-30 s). Prior EEG work on visual adaptation using longer adaptors has shown that feature-specific effects emerge very early (<100 ms) after test onset in both univariate and multivariate responses (Rideaux et al., 2023, PNAS). In contrast, the present study finds no difference between same and different adaptor conditions until much later (>300 ms). These later effects likely reflect cognitive processes such as template matching or decision-making, rather than sensory adaptation. Although early differences appear between blank and adaptor conditions, these could be explained by interactions between ERPs elicited by adaptor onset/offset and those elicited by the test stimulus; therefore, they cannot be attributed to adaptation. This contradicts the statement in the Discussion that "Our EEG measurements show clear evidence of repetition suppression, in the form of reduced responses to the repeated noise pattern early in time."
A second concern is the brief inter-stimulus interval. The adaptor is shown for 1.5 s, followed by only a 134 ms blank before the target. When the "adaptor" and test noise are identical, improved performance could simply arise from detecting the pixels that change, namely, those forming the target number. Such change detection does not require adaptation; even simple motion detector units would suffice. If the blank period were longer, beyond the temporal window of motion detectors, then improved performance would more convincingly reflect adaptation. Given the very short blank, however, a more parsimonious explanation for the behavioural effect in the same-noise condition is that change detection mechanisms isolate the target.
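To make the change-detection account concrete, here is a minimal Python sketch. It assumes a hypothetical 28x28 noise image, a rectangular patch standing in for the digit, and an arbitrary contrast of 0.2; none of these values or function names come from the manuscript. It shows that plain frame differencing isolates the target when the adaptor and test noise are identical, and fails to do so when they differ.

```python
import numpy as np

def changed_pixels(adaptor, test, tol=1e-9):
    """Boolean map of pixels that differ between the adaptor and test frames."""
    return np.abs(test - adaptor) > tol

def embed_target(noise, target_mask, contrast):
    """Blend a uniform white target into a noise image at the given contrast."""
    frame = noise.copy()
    frame[target_mask] = (1 - contrast) * noise[target_mask] + contrast * 1.0
    return frame

rng = np.random.default_rng(0)
adaptor_noise = rng.uniform(0.0, 1.0, size=(28, 28))  # hypothetical adaptor pattern
other_noise = rng.uniform(0.0, 1.0, size=(28, 28))    # hypothetical different-noise pattern

target_mask = np.zeros((28, 28), dtype=bool)
target_mask[10:18, 12:16] = True                       # stand-in for a digit

same_test = embed_target(adaptor_noise, target_mask, contrast=0.2)
diff_test = embed_target(other_noise, target_mask, contrast=0.2)

same_changed = changed_pixels(adaptor_noise, same_test)
diff_changed = changed_pixels(adaptor_noise, diff_test)

print("same noise, changed pixels equal target:", np.array_equal(same_changed, target_mask))
print("different noise, fraction of pixels changed:", diff_changed.mean())
```

In the same-noise case the difference map coincides with the target region without any adaptation state, which is the confound described above.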
Differences between the blank and adaptor conditions may also be explained by temporal cueing. In the noise conditions, the noise reliably signals the upcoming target time, whereas the blank condition provides no such cue. Given the variable inter-trial interval and the brief target presentation, this temporal cue would strongly facilitate target perception. This account is consistent with the reaction time results: both adaptor conditions produce faster reaction times than the blank condition, but do not differ from each other.
The decoding analyses are also difficult to interpret, given the training-testing protocol. All trials from the three main conditions (blank, same, different) were used to train the classifier, and then held-out trials, all from one condition, were decoded. Because ERPs in the adaptor conditions differ substantially from those in the blank condition, and because there are twice as many adaptor trials, the classifier is biased toward patterns from the adaptor conditions and will naturally perform worse on blank trials. To compare decoding accuracy meaningfully across conditions, the classifier should be trained on a separate unbiased dataset (e.g., the "clean" data), or each condition should be trained and tested separately using cross-fold validation.
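The second suggestion, training and testing within each condition, could look roughly like the sketch below. It is an illustration only: the data are synthetic, the nearest-class-mean decoder is a stand-in for whatever classifier the authors used, and the folds are shuffled but not stratified. All function names are hypothetical.

```python
import numpy as np

def cv_accuracy(X, y, n_folds=5, seed=0):
    """Cross-validated accuracy of a nearest-class-mean decoder."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    classes = np.unique(y)
    correct = 0
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Class means are estimated from training trials only.
        means = np.stack([X[train_idx][y[train_idx] == c].mean(axis=0) for c in classes])
        dists = ((X[test_idx][:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        correct += np.sum(classes[dists.argmin(axis=1)] == y[test_idx])
    return correct / len(y)

def decode_per_condition(X, y, condition):
    """Train and test separately within each condition, so no condition dominates training."""
    return {str(c): cv_accuracy(X[condition == c], y[condition == c])
            for c in np.unique(condition)}

# Synthetic stand-in for EEG features: 120 trials per condition, 16 features, 2 classes.
rng = np.random.default_rng(1)
n_per = 120
condition = np.repeat(np.array(["blank", "same", "different"]), n_per)
y = np.tile(np.array([0, 1]), 3 * n_per // 2)
X = rng.normal(size=(3 * n_per, 16)) + 1.0 * y[:, None]

accuracies = decode_per_condition(X, y, condition)
print(accuracies)
```

Because each decoder only ever sees trials from its own condition, differences in accuracy across conditions can no longer be attributed to unequal representation in the training set.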
Reviewer #2 (Public review):
Summary:
Neurons adapt to prolonged or repeated sensory inputs. One function of such adaptation may be to save resources to avoid representing the same inputs over and over again. However, it has been hypothesized that adaptation could additionally help improve the representation of sensory stimuli, especially during difficult recognition scenarios. This study sheds light on this question and provides behavioral evidence for such enhancement. The behavioral results are interesting and compelling. The paper also includes scalp electroencephalographic (EEG) data, which are noisy but point toward similar conclusions. The authors finally implement a deep convolutional neural network (DCNN) with adaptation mechanisms, which nicely capture human behavior.
Strengths:
(1) The authors introduce an interesting hypothesis about the role of adaptation in visual recognition.
(2) The authors present interesting and compelling behavioral data consistent with the hypothesis.
(3) The authors introduce a computational model that can capture mechanisms that can lead to adaptation, enhancing visual recognition.
Weaknesses:
(1) The main weakness is the scalp EEG data. As detailed below, the results are minimal at best and do not contribute to understanding the mechanisms of adaptation. The paper would be stronger without the EEG data.
(2) I wonder whether the hypothesis also holds with real-world objects in natural scenes, beyond the confines of MNIST digits.
Reviewer #3 (Public review):
Summary:
Brands and colleagues investigate how temporal adaptation can aid object recognition, and what neural computations may underlie these effects. They employed a previously published experimental paradigm to study how adaptation to temporally constant distractor input facilitates the recognition of a newly appearing target object. Specifically, they studied how this effect is modulated by the contrast of the target object.
They found that adaptation enhances the recognition of high-contrast objects more than that of low-contrast objects. This behavioral effect was mirrored by a larger effect of adaptation on the response to the high-contrast objects in relatively higher visual areas.
To investigate what neural computations can support this interaction, they implement several candidate neural mechanisms in a deep convolutional neural network: additive suppression, divisive suppression, and lateral recurrence. The authors conclude that divisive and additive suppression, which are intrinsic to the neuron, best explain the interaction between contrast and adaptation in the human data. They further show that these mechanisms, and divisive suppression in particular, show increased robustness to spatial shifts of the adaptor stimulus, hinting at potential perceptual benefits.
Strengths:
(1) Overall, this is a well-written paper, supported by thorough analyses and illustrated with clear, well-designed figures that effectively show overall trends as well as data variance. The authors tell a compelling story while responsibly steering away from overreaching conclusions.
(2) What makes this paper stand out is its comprehensive approach to understanding the behavioral benefit of neural adaptation and its mechanistic underpinnings. The authors effectively achieve this through integrating new behavioral and neural data with simulations using neural network models.
(3) The findings convincingly demonstrate that neuronally intrinsic adaptation mechanisms are sufficient to explain the observed interaction between temporal adaptation, contrast, and object recognition. Furthermore, the paper highlights that these intrinsic mechanisms offer superior robustness compared to learned lateral recurrence mechanisms, which, while being more expressive, can also be more brittle.
Weaknesses:
While the results and conclusions are well supported, there are a few major points that need clarification.
(1) Divisive normalization
I was confused by the authors' classification of divisive normalization as a neuronally intrinsic mechanism, that is, one that operates within a single neuron, independent of interactions with other neurons.
My understanding is that divisive normalization, as originally proposed by Heeger in the early nineties, describes a mechanism where neurons integrate pooled activity from neighboring cells to mutually inhibit one another. In this form, divisive normalization is fundamentally an interneuronal mechanism involving recurrence. Adding to the confusion, the authors highlight in the introduction their interest in divisive normalization for its relation to stimulus contrast, a relation likely linked to neuronal pooling.
However, my reading of the methods section (Equations 6 and 7) suggests the authors implemented only a temporal feedback component, leaving out the pooling across neurons (Equation 5). This distinction should be disambiguated early in the paper. I recommend choosing a less ambiguous term than "divisive normalization". Even "temporal divisive normalization" is still ambiguous, as lateral neuronal interactions are also inherently temporal.
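The distinction can be illustrated with a small Python sketch. It is not a reproduction of the manuscript's Equations 5 to 7: the pooled form below is the commonly cited canonical expression (each unit's drive divided by a constant plus the summed drive of the pool), and the temporal form is an assumed, simplified variant in which each unit is divided only by a running average of its own past output. The parameter names `sigma`, `n`, and `alpha` are illustrative.

```python
import numpy as np

def pooled_normalization(drive, sigma=1.0, n=2.0):
    """Canonical pooled form: each unit is divided by the summed activity of all units."""
    drive = np.asarray(drive, dtype=float)
    return drive ** n / (sigma ** n + np.sum(drive ** n))

def temporal_self_normalization(drive_over_time, sigma=1.0, alpha=0.9):
    """Assumed intrinsic variant: each unit is divided by its own response history only."""
    drive_over_time = np.asarray(drive_over_time, dtype=float)
    state = np.zeros(drive_over_time.shape[1])  # per-unit running average of past output
    out = np.empty_like(drive_over_time)
    for t, x in enumerate(drive_over_time):
        out[t] = x / (sigma + state)
        state = alpha * state + (1 - alpha) * out[t]
    return out

# Unit 0 receives the same drive in both cases; only its neighbour (unit 1) differs.
pooled_quiet = pooled_normalization([1.0, 0.0])
pooled_loud = pooled_normalization([1.0, 3.0])

steps = 10
quiet = np.column_stack([np.ones(steps), np.zeros(steps)])
loud = np.column_stack([np.ones(steps), 3.0 * np.ones(steps)])
temporal_quiet = temporal_self_normalization(quiet)
temporal_loud = temporal_self_normalization(loud)

print("pooled, unit 0:", pooled_quiet[0], pooled_loud[0])
print("temporal, unit 0 over time:", temporal_quiet[:, 0])
```

In the pooled form, unit 0's response depends on its neighbour's activity. In the temporal-only form it does not, although its response to a constant input still declines over time. This is why the two mechanisms warrant different names.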
(2) Parietal electrodes
The paper's adapter-specific effects are centered around the P9/P10 electrodes, which the authors identify as "parietal." However, it is unclear to me which part of the cortex drives these electrodes, particularly whether it is actually the parietal cortex. I am no expert in EEG, but based on the topomaps in Figures 4 and 5, it appears that these electrodes cover more posterior occipito-temporal regions rather than truly parietal regions. Given the central role of P9/P10 to the main findings, the paper would be significantly improved for non-EEG readers by clarifying which cortical regions are covered by these electrodes.
(3) Interpretation of non-significant statistical results
In some places, the authors attach relatively strong claims to non-significant statistical results. For example, in Figure 5D, they claim that there is no effect of contrast on occipital electrodes, based on a non-significant p-value. P-values do not quantify evidence for the null hypothesis, so the authors should be careful with such claims. In fact, Figure 5D shows such a clear negative slope, with variance comparable to Figure 5A, that I am surprised that the p-value for the slope of Figure 5D was in fact so large. A similar issue arises in the discussion for Figure 6, where the authors claim that the effect of contrast is adapter-specific. However, this claim is based on the observation that the effect is significant for same-noise trials, but not for different-noise or blank trials. To statistically substantiate the claim that there is an adapter-specific effect, the authors should directly compare the slope for same-noise trials with the slope for different-noise/blank trials.
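One way to run such a direct comparison is sketched below in Python. It assumes per-participant responses at several contrast levels in each condition. The data are synthetic, the contrast levels and effect sizes are invented for illustration, and a paired sign-flip permutation test is used as one reasonable choice among several (a paired t-test or a mixed model with a contrast-by-condition interaction would serve the same purpose).

```python
import numpy as np

def per_participant_slopes(contrasts, responses):
    """Least-squares slope of response against contrast, one per participant (row)."""
    return np.array([np.polyfit(contrasts, r, 1)[0] for r in responses])

def paired_sign_flip_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the mean paired difference a - b."""
    rng = np.random.default_rng(seed)
    diff = np.asarray(a) - np.asarray(b)
    observed = diff.mean()
    # Under the null, the sign of each participant's difference is exchangeable.
    signs = rng.choice([-1.0, 1.0], size=(n_perm, diff.size))
    null = (signs * diff).mean(axis=1)
    p = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p

# Synthetic example: 20 participants, 5 hypothetical contrast levels.
rng = np.random.default_rng(2)
contrasts = np.linspace(0.5, 0.9, 5)
same_noise = 1.0 * contrasts + rng.normal(scale=0.05, size=(20, 5))
different_noise = 0.2 * contrasts + rng.normal(scale=0.05, size=(20, 5))

slopes_same = per_participant_slopes(contrasts, same_noise)
slopes_diff = per_participant_slopes(contrasts, different_noise)
mean_slope_difference, p_value = paired_sign_flip_test(slopes_same, slopes_diff)
print(mean_slope_difference, p_value)
```

The test is applied to the difference in slopes itself, so a significant result supports an adapter-specific effect directly rather than inferring it from one significant and one non-significant slope.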
(4) The match between behavior and models
The authors' claim that models with intrinsic adaptation better match the interaction between contrast and temporal adaptation observed in human behavior is not fully substantiated. This conclusion appears to be based on a qualitative assessment of Figure 8, which, in my view, does not unambiguously rule out an interaction for lateral recurrence. Furthermore, a potential confounding factor is the ceiling effect that limits higher accuracy values. Indeed, conditions where the interaction was absent or weaker (i.e., shorter time sequences and lateral inhibition) are also the conditions where accuracy values are closer to this ceiling, which may mask a potential interaction.
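The ceiling concern can be illustrated numerically. The Python sketch below uses invented baseline accuracies and an assumed constant benefit on the logit scale. It is not derived from the manuscript's data. It shows that the same underlying benefit produces progressively smaller gains in raw accuracy as baseline performance approaches 100%.

```python
import numpy as np

def logit(p):
    """Log-odds of a proportion."""
    return np.log(p / (1 - p))

def inv_logit(z):
    """Inverse of the logit transform."""
    return 1 / (1 + np.exp(-z))

baseline = np.array([0.60, 0.80, 0.95])  # hypothetical accuracies without adaptation
gain = 1.0                               # identical benefit on the logit scale (assumed)

adapted = inv_logit(logit(baseline) + gain)
raw_gain = adapted - baseline            # benefit measured in raw accuracy
logit_gain = logit(adapted) - logit(baseline)

print("raw accuracy gain:", raw_gain)
print("logit gain:", logit_gain)
```

Analysing accuracy on a scale that is not bounded at ceiling, or testing the interaction in conditions matched for baseline accuracy, would help separate a true absence of interaction from compression near the ceiling.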