Sensory-memory interactions via modular structure explain errors in visual working memory

Curation statements for this article:
  • Curated by eLife

    eLife logo

    eLife assessment

    In this important paper, the authors propose a computational model for understanding how the dynamics of neural representations may lead to specific patterns of errors as observed in working memory tasks. The paper provides solid evidence showing how a two-area model of sensory-memory interactions can account for the error patterns reported in orientation estimation tasks with delays. By integrating ideas from efficient coding and attractor networks, the resulting theoretical framework is appealing, and nicely captures some basic patterns of behavior data and the distributed nature of memory representation as reported in prior neurophysiological studies. The paper can be strengthened if (i) further analyses are conducted to deepen our understanding of the circuit mechanisms underlying the behavior effects; (ii) the necessity of the two-area network model is better justified; (iii) the nuanced aspects of the behavior that are not captured by the current model are discussed in more detail.

This article has been Reviewed by the following groups

Read the full article See related articles

Abstract

Errors in stimulus estimation reveal how stimulus representation changes during cognitive processes. Repulsive bias and minimum variance observed near cardinal axes are well-known error patterns typically associated with visual orientation perception. Recent experiments suggest that these errors continuously evolve during working memory, posing a challenge that neither static sensory models nor traditional memory models can address. Here, we demonstrate that these evolving errors, maintaining characteristic shapes, require network interaction between two distinct modules. Each module fulfills efficient sensory encoding and memory maintenance, which cannot be achieved simultaneously in a single-module network. The sensory module exhibits heterogeneous tuning with strong inhibitory modulation reflecting natural orientation statistics. While the memory module, operating alone, supports homogeneous representation via continuous attractor dynamics, the fully connected network forms discrete attractors with moderate drift speed and nonuniform diffusion processes. Together, our work underscores the significance of sensory-memory interaction in continuously shaping stimulus representation during working memory.

Article activity feed

  1. eLife assessment

    In this important paper, the authors propose a computational model for understanding how the dynamics of neural representations may lead to specific patterns of errors as observed in working memory tasks. The paper provides solid evidence showing how a two-area model of sensory-memory interactions can account for the error patterns reported in orientation estimation tasks with delays. By integrating ideas from efficient coding and attractor networks, the resulting theoretical framework is appealing, and nicely captures some basic patterns of behavior data and the distributed nature of memory representation as reported in prior neurophysiological studies. The paper can be strengthened if (i) further analyses are conducted to deepen our understanding of the circuit mechanisms underlying the behavior effects; (ii) the necessity of the two-area network model is better justified; (iii) the nuanced aspects of the behavior that are not captured by the current model are discussed in more detail.

  2. Reviewer #1 (Public Review):

    Summary:

    Working memory is imperfect - memories accrue errors over time and are biased towards certain identities. For example, previous work has shown memory for orientation is more accurate near the cardinal directions (i.e., variance in responses is smaller for horizontal and vertical stimuli) while being biased towards diagonal orientations (i.e., there is a repulsive bias away from horizontal and vertical stimuli). The magnitude of errors and biases increase the longer an item is held in working memory and when more items are held in working memory (i.e., working memory load is higher). Previous work has argued that biases and errors could be explained by increased perceptual acuity at cardinal directions. However, these models are constrained to sensory perception and do not explain how biases and errors increase over time in memory. The current manuscript builds on this work to show how a two-layer neural network could integrate errors and biases over a memory delay. In brief, the model includes a 'sensory' layer with heterogenous connections that lead to the repulsive bias and decreased error in the cardinal directions. This layer is then reciprocally connected with a classic ring attractor layer. Through their reciprocal interactions, the biases in the sensory layer are constantly integrated into the representation in memory. In this way, the model captures the distribution of biases and errors for different orientations that have been seen in behavior and their increasing magnitude with time. The authors compare the two-layer network to a simpler one-network model, showing that the one-model network is harder to tune and shows an attractive bias for memories that have lower error (which is incompatible with empirical results).

    Strengths:

    The manuscript provides a nice review of the dynamics of items in working memory, showing how errors and biases differ across stimulus space. The two-layer neural network model is able to capture the behavioral effects as well as relate to neurophysiological observations that memory representations are distributed across the sensory cortex and prefrontal cortex.

    The authors use multiple approaches to understand how the network produces the observed results. For example, analyzing the dynamics of memories in the low-dimensional representational space of the networks provides the reader with an intuition for the observed effects.

    As a point of comparison with the two-layer network, the authors construct a heterogenous one-layer network (analogous to a single memory network with embedded biases). They argue that such a network is incapable of capturing the observed behavioral effects but could potentially explain biases and noise levels in other sensory domains where attractive biases have lower errors (e.g., color).

    The authors show how changes in the strength of Hebbian learning of excitatory and inhibitory synapses can change network behavior. This argues for relatively stronger learning in inhibitory synapses, an interesting prediction.

    The manuscript is well-written. In particular, the figures are well done and nicely schematize the model and the results.

    Weaknesses:

    Despite its strengths, the manuscript does have some weaknesses.

    First, as far as we can tell, behavioral data is only presented in schematic form. This means some of the nuances of the effects are lost. It also means that the model is not directly capturing behavioral effects. Therefore, while providing insight into the general phenomenon, the current manuscript may be missing some important aspects of the data.

    Relatedly, the models are not directly fit to behavioral data. This makes it hard for the authors to exclude the possibility that there is a single network model that could capture the behavioral effects. In other words, it is hard to support the authors' conclusion that "....these evolving errors...require network interaction between two distinct modules." (from the abstract, but similar comments are made throughout the manuscript). Such a strong claim needs stronger evidence than what is presented. Fitting to behavioral data could allow the authors to explore the full parameter space for both the one-layer and two-layer network architectures.

    In addition, directly comparing the ability of different model architectures to fit behavioral data would allow for quantitative comparison between models. Such quantitative comparisons are currently missing from the manuscript.

    To help broaden the impact of the paper, it would be helpful if the authors provided insight into how the observed behavioral biases and/or network structures influence cognition. For example, previous work has argued that biases may counteract noise, leading to decreased variance at certain locations. Is there a similar normative explanation for why the brain would have repulsive biases away from commonly occurring stimuli? Are they simply a consequence of improved memory accuracy? Why isn't this seen for all stimulus domains?

    Previous work has found both diffusive noise and biases increase with the number of items in working memory. It isn't clear how the current model would capture these effects. The authors do note this limitation in the Discussion, but it remains unclear how the current model can be generalized to a multi-item case.

    The role of the ring attractor memory network isn't completely clear. There is noise added in this stage, but how is this different from the noise added at the sensory stage? Shouldn't these be additive? Is the noise necessary? Similarly, it isn't clear whether the memory network is necessary - can it be replaced by autapses (self-connections) in the sensory network to stabilize its representation? In short, it would be helpful for the authors to provide an intuition for why the addition of the memory network facilitates the repulsive bias.

    Overall:

    Overall, the manuscript was successful in building a model that captured the biases and noise observed in working memory. This work complements previous studies that have viewed these effects through the lens of optimal coding, extending these models to explain the effects of time in memory. In addition, the two-layer network architecture extends previous work with similar architectures, adding further support to the distributed nature of working memory representations.

  3. Reviewer #2 (Public Review):

    In this manuscript, Yang et al. present a modeling framework to understand the pattern of response biases and variance observed in delayed-response orientation estimation tasks. They combine a series of modeling approaches to show that coupled sensory-memory networks are in a better position than single-area models to support experimentally observed delay-dependent response bias and variance in cardinal compared to oblique orientations. These errors can emerge from a population-code approach that implements efficient coding and Bayesian inference principles and is coupled to a memory module that introduces random maintenance errors. A biological implementation of such operation is found when coupling two neural network modules, a sensory module with connectivity inhomogeneities that reflect environment priors, and a memory module with strong homogeneous connectivity that sustains continuous ring attractor function. Comparison with single-network solutions that combine both connectivity inhomogeneities and memory attractors shows that two-area models can more easily reproduce the patterns of errors observed experimentally. This, the authors take as evidence that a sensory-memory network is necessary, but I am not convinced about the evidence in support of this "necessity" condition. A more in-depth understanding of the mechanisms operating in these models would be necessary to make this point clear.

    Strengths:

    The model provides an integration of two modeling approaches to the computational bases of behavioral biases: one based on Bayesian and efficient coding principles, and one based on attractor dynamics. These two perspectives are not usually integrated consistently in existing studies, which this manuscript beautifully achieves. This is a conceptual advancement, especially because it brings together the perceptual and memory components of common laboratory tasks.

    The proposed two-area model provides a biologically plausible implementation of efficient coding and Bayesian inference principles, which interact seamlessly with a memory buffer to produce a complex pattern of delay-dependent response errors. No previous model had achieved this.

    Weaknesses:

    The correspondence between the various computational models is not fully disclosed. It is not easy to see this correspondence because the network function is illustrated with different representations for different models and the correspondence between components of the various models is not specified. For instance, Figure 1 shows that a specific pattern of noise is required in the low-dimensional attractor model, but in the next model in Figure 2, the memory noise is uniform for all stimuli. How do these two models integrate? What element in the population-code model of Figure 2 plays the role of the inhomogeneous noise of Figure 1? Also, the Bayesian model of Figure 2 is illustrated with population responses for different stimuli and delays, while the attractor models of Figures 3 and 4 are illustrated with neuronal tuning curves but not population activity. In addition, error variance in the Bayesian model appears to be already higher for oblique orientations in the first iteration whereas it is only first shown one second into the delay for the attractor model in Figure 4. It is thus unclear whether variance inhomogeneities appear already at the perceptual stage in the attractor model, as it does in the population-code model. Of course, correspondences do not need to be perfect, but the reader does not know right now how far the correspondence between these models goes.

    The manuscript does not identify the mechanistic origin in the model of Figure 4 of the specific noise pattern that is required for appropriate network function (with higher noise variance at oblique orientations). This mechanism appears critical, so it would be important to know what it is and how it can be regulated. In particular, it would be interesting to know if the specific choice of Poisson noise in Equation (3) is important. Tuning curves in Figure 4 indicate that population activity for oblique stimuli will have higher rates than for cardinal stimuli and thus induce a larger variance of injected noise in oblique orientations, based on this Poisson-noise assumption. If this explanation holds, one wonders if network inhomogeneities could be included (for instance in neural excitability) to induce higher firing rates in the cardinal/oblique orientations so as to change noise inhomogeneities independently of the bias and thus control more closely the specific pattern of errors observed, possibly within a single memory network.

    The main conclusion of the manuscript, that the observed patterns of errors "require network interaction between two distinct modules" is not convincingly shown. The analyses show that there is a quantitative but not a qualitative difference between the dynamics of the single memory area compared to the sensory-memory two-area network, for specific implementations of these models (Figure 7 - Figure Supplement 1). There is no principled reasoning that demonstrates that the required patterns of response errors cannot be obtained from a different memory model on its own. Also, since the necessity of the two-area configuration is highlighted as the main conclusion of the manuscript, it is inconvenient that the figure that carefully compares these conditions is in the Supplementary Material.

    The proposed model has stronger feedback than feedforward connections between the sensory and memory modules. This is not a common assumption when thinking about hierarchical processing in the brain, and it is not discussed in the manuscript.

  4. Reviewer #3 (Public Review):

    Summary:

    The present study proposes a neural circuit model consisting of coupled sensory and memory networks to explain the circuit mechanism of the cardinal effect in orientation perception which is characterized by the bias towards the oblique orientation and the largest variance at the oblique orientation.

    Strengths:

    The authors have done numerical simulations and preliminary analysis of the neural circuit model to show the model successfully reproduces the cardinal effect. And the paper is well-written overall. As far as I know, most of the studies on the cardinal effect are at the level of statistical models, and the current study provides one possibility of how neural circuit models reproduce such an effect.

    Weaknesses:

    There are no major weaknesses and flaws in the present study, although I suggest the author conduct further analysis to deepen our understanding of the circuit mechanism of the cardinal effects. Please find my recommendations for concrete comments.