Rate-distortion theory of neural coding and its implications for working memory

Curation statements for this article:
  • Curated by eLife

Abstract

Rate-distortion theory provides a powerful framework for understanding the nature of human memory by formalizing the relationship between information rate (the average number of bits per stimulus transmitted across the memory channel) and distortion (the cost of memory errors). Here, we show how this abstract computational-level framework can be realized by a model of neural population coding. The model reproduces key regularities of visual working memory, including some that were not previously explained by population coding models. We verify a novel prediction of the model by reanalyzing recordings of monkey prefrontal neurons during an oculomotor delayed response task.
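To make the rate-distortion trade-off concrete, points on the rate-distortion curve for a circular stimulus with cosine distortion can be traced numerically with the standard Blahut-Arimoto algorithm. The sketch below is purely illustrative (it is not the authors' implementation; the discretization, prior, and trade-off values are arbitrary choices):

```python
import numpy as np

def blahut_arimoto(p_x, d, beta, n_iter=200):
    """Blahut-Arimoto iteration for the rate-distortion problem.
    p_x: prior over stimuli; d: distortion matrix d[x, y];
    beta: trade-off parameter (larger beta -> higher rate, lower distortion)."""
    q_y = np.ones(d.shape[1]) / d.shape[1]        # marginal over responses
    for _ in range(n_iter):
        q_xy = q_y[None, :] * np.exp(-beta * d)   # optimal Q(y|x) given marginal
        q_xy /= q_xy.sum(axis=1, keepdims=True)
        q_y = p_x @ q_xy                          # re-estimate response marginal
    # rate = mutual information I(X;Y) in bits; distortion = expected d(x, y)
    rate = np.sum(p_x[:, None] * q_xy * np.log2(q_xy / q_y[None, :] + 1e-300))
    distortion = np.sum(p_x[:, None] * q_xy * d)
    return rate, distortion

# Discretized circular stimulus space, cosine distortion d(x, y) = 1 - cos(x - y)
n = 64
theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
d = 1.0 - np.cos(theta[:, None] - theta[None, :])
p_x = np.ones(n) / n                              # uniform prior over the circle

for beta in (0.5, 2.0, 8.0):
    r, dist = blahut_arimoto(p_x, d, beta)
    print(f"beta={beta}: rate={r:.2f} bits, distortion={dist:.2f}")
```

Sweeping beta traces the curve: as the information rate (bits per stimulus) grows, the achievable distortion falls, which is the constraint the abstract describes the memory channel as operating under.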

Article activity feed

  1. Evaluation Summary:

    This paper is of potential interest to readers in the fields of working memory and neural coding. It presents a model of a neural circuit that learns to optimally represent its inputs subject to an information capacity limit and claims that this model can account for a range of empirical phenomena in the visual working memory literature. However, the fit to empirical data is qualitative and in some cases unconvincing, certain aspects of the neural model seem difficult to square with established neurophysiology, and there is insufficient conceptual or quantitative comparison with other models in the WM literature that seek to explain the same data.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

  2. Reviewer #1 (Public Review):

    This study presents a model of an idealized neural circuit that learns to minimize the distortion between its inputs and outputs subject to a capacity limit on the mutual information between them. With some further assumptions about the persistence of activity, this model is used to make predictions for patterns of working memory (WM) error that are compared to existing human behavioural data, and predictions for neural responses that are compared to existing primate electrophysiology data. The authors are to be commended for their ambitious and inventive approach and their attempt to bridge different fields; however:

    - Some aspects seem improbable from a biological perspective, e.g., the tuning width in the model depends on the information rate R, which is predicted to change over short time intervals and with external factors such as the number of stimuli. Electrophysiological observations do not generally support these kinds of tuning changes.
    - The choice of studies/data for comparison with the model seems selective and overlooks well-established observations about WM that might challenge the model, e.g., the model would seem to predict that WM performance improves when stimuli are presented sequentially, but this is not the usual finding.
    - The match between model predictions and behavioural data is qualitative at best in a field where existing models can reproduce the same data with quantitative precision. Like previous studies that have assumed an information limit on WM (i.e. a fixed number of bits), the model struggles to reproduce the effects of set size.
    - The neural data provide a weak test of the model: the observation that performance correlates with activity level is common to most population models. The behavioural observation in primates that trials with large errors tend to be followed by trials with smaller errors could have a range of alternative explanations.
    - There are very few details of how the model outlined at a theoretical level was actually implemented to generate predictions for the different experiments.
    - The account of the model is difficult to follow with few signposts for readers who aren't well-versed in rate-distortion theory.

  3. Reviewer #2 (Public Review):

    The authors claim that typical trends in the response statistics of subjects performing delayed estimation tasks can be described as the result of a population coding implementation of rate-distortion theory, where the mutual information between stimulus and response fixes the capacity and circular error describes the distortion. Their model accounts for a number of replicated results in the working memory literature (set size, timing, and serial dependence effects) and is easily interpretable in terms of simple parameterized models, especially the optimization of a neural gain parameter in a Poisson spiking model. However, the paper as written overstates the physiologically relevant predictions of the model, since the mechanistic implementation of the rate-distortion solution is more simplistic than is plausible given what we know about the neural circuit mechanisms underlying working memory.
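As a toy illustration of the gain-based reading described above (this is not the authors' model; the tuning curves, decoder, and parameter values are assumptions made for illustration), one can simulate a Poisson population code for a circular stimulus and observe that mean cosine error falls as gain rises:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_errors(gain, n_neurons=32, n_trials=2000):
    """Toy Poisson population code for a circular stimulus.
    Preferred angles tile the circle; 'gain' scales expected spike counts.
    The estimate is decoded by the population vector; returns mean cosine error."""
    prefs = np.linspace(0, 2 * np.pi, n_neurons, endpoint=False)
    stim = rng.uniform(0, 2 * np.pi, n_trials)
    # von Mises-like tuning curves, scaled by the gain parameter
    rates = gain * np.exp(np.cos(stim[:, None] - prefs[None, :]) - 1)
    spikes = rng.poisson(rates)
    # population-vector decoder: angle of the summed preferred-direction vectors
    est = np.angle(spikes @ np.exp(1j * prefs))
    return np.mean(1.0 - np.cos(est - stim))

for g in (0.5, 2.0, 8.0):
    print(f"gain={g}: mean cosine error = {simulate_errors(g):.3f}")
```

In this sketch, gain plays the role the reviewer describes: raising it spends more spikes (a higher effective information rate) to buy lower circular error, while the mechanistic questions the review raises (recurrent excitation, synaptic effects) are entirely absent.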

    Strengths: Rate-distortion theory is a well-defined and falsifiable account of the origin of error in psychophysical tasks, which the authors describe crisply while providing an interesting link to a corresponding population coding model. In doing so, they identify physiological parameters (neural gain) that can correspond to parameters of the rate-distortion optimization problem (priors → weightings; distortion penalty → neural gain). I also appreciate the comprehensive study of several different aspects of working memory limitations as outputs of the model, and the qualitative comparisons with response data. The authors also deserve credit for providing open software: a working code repository is linked and available, with Jupyter notebooks that reproduce all the paper's figures.

    Weaknesses: The population coding model is simplistic compared to the more likely and well-validated mechanisms of delay-period encoding, for which there is extensive literature (e.g., Compte et al. 2000; Wimmer et al. 2014), so care must be taken not to over-interpret its results. Delay-period activity likely emerges from recurrent excitation, which is absent from the model. Along these lines, heterogeneity in neural activity is likely the effect, not the underlying cause (which is more likely synaptic in nature), of the serial and frequency bias results. The model parameter choices used for comparisons with data are also unclear; the authors should say whether they fit the parameters or chose them some other way.

  4. Reviewer #3 (Public Review):

    Rate-distortion theory is a mathematical framework that describes the optimal solution to the lossy data compression problem (optimizing a performance metric subject to a cost or constraint on information rate). This framework has previously been used to understand human visual working memory at the abstract computational level. This paper seeks to extend that prior work by implementing a detailed and biologically plausible neural population coding model of visual memory that achieves the normative performance bounds predicted by rate-distortion theory. The model is shown to reproduce previously described empirical phenomena, including set-size and serial-order effects, among others, and is also applied to previously collected neural data.

    Strengths

    • The model proposed by the authors is closely connected to a principled and well-understood theoretical framework (rate-distortion theory). Hence, the model can be seen as successfully bridging Marr's levels of analysis (computational, algorithmic, and implementational).
    • The resulting model is also fairly parsimonious (e.g., it has no ad hoc components or mechanisms whose only seeming role is to account for specific empirical phenomena).

    Weaknesses

    • There are numerous existing computational models of visual working memory, including models not based on information-theoretic principles. While the authors show their model successfully reproduces a range of known behavioral phenomena, there are no formal model comparisons to alternative models.
    • How the model might scale to more complex visual information is largely unknown. For example, the model is designed to optimize a fixed cost function (cosine error for circular stimuli such as colors or oriented lines). It is not clear whether this is an appropriate cost function for visual memory for complex stimuli. Although the authors reference models that utilize variational autoencoders as a possible solution to this dilemma, it is not clear exactly how such models relate to the present work.

    Appraisal

    • The claims of the paper are relatively straightforward: The authors show that their model can be derived in a principled fashion from rate-distortion theory, and show that the resulting model successfully reproduces a range of documented empirical phenomena. Each of these claims is well-supported by the data.

    Potential impact

    • Visual working memory is an important field of study in neuroscience and psychology, as it bridges perception, learning, and memory. Many, many models have been proposed in this space. The current work is notable in that it offers a detailed neural implementation that retains a close connection to well-understood computational principles. In addition, the work expands upon recent and growing interest in "computational rationality", or the idea of systems that optimize performance subject to their resource or information processing constraints. Hence, the work is likely of interest to a wide audience in computational cognitive science.