Coordinated multiplexing of information about separate objects in visual cortex

Curation statements for this article:
  • Curated by eLife

Abstract

Sensory receptive fields are large enough that they can contain more than one perceptible stimulus. How, then, can the brain encode information about each of the stimuli that may be present at a given moment? We recently showed that when more than one stimulus is present, single neurons can fluctuate between coding one vs. the other(s) across some time period, suggesting a form of neural multiplexing of different stimuli (Caruso et al., 2018). Here, we investigate (a) whether such coding fluctuations occur in early visual cortical areas; (b) how coding fluctuations are coordinated across the neural population; and (c) how coordinated coding fluctuations depend on the parsing of stimuli into separate vs. fused objects. We found coding fluctuations do occur in macaque V1 but only when the two stimuli form separate objects. Such separate objects evoked a novel pattern of V1 spike count (‘noise’) correlations involving distinct distributions of positive and negative values. This bimodal correlation pattern was most pronounced among pairs of neurons showing the strongest evidence for coding fluctuations or multiplexing. Whether a given pair of neurons exhibited positive or negative correlations depended on whether the two neurons both responded better to the same object or had different object preferences. Distinct distributions of spike count correlations based on stimulus preferences were also seen in V4 for separate objects but not when two stimuli fused to form one object. These findings suggest multiple objects evoke different response dynamics than those evoked by single stimuli, lending support to the multiplexing hypothesis and suggesting a means by which information about multiple objects can be preserved despite the apparent coarseness of sensory coding.
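
The multiplexing account makes a concrete prediction about pairwise spike count statistics. The toy simulation below is a minimal sketch of that logic (all rates, trial counts, and the assumption of fully shared switching are invented for illustration; this is not the authors' analysis code): if the recorded population switches between encoding object A and object B from trial to trial, pairs that prefer the same object come out positively correlated and pairs that prefer different objects come out negatively correlated.

```python
# Minimal illustration of coordinated multiplexing (hypothetical rates; Poisson counts).
import numpy as np

rng = np.random.default_rng(0)
n_trials = 500
rate_pref, rate_nonpref = 30.0, 10.0      # assumed spikes per counting window

# Which object the population encodes on each "AB" trial (a single shared switch)
encodes_A = rng.random(n_trials) < 0.5

# Neurons 1 and 2 prefer object A; neuron 3 prefers object B
n1 = rng.poisson(np.where(encodes_A, rate_pref, rate_nonpref))
n2 = rng.poisson(np.where(encodes_A, rate_pref, rate_nonpref))
n3 = rng.poisson(np.where(encodes_A, rate_nonpref, rate_pref))

print("same-preference pair r_sc:      %.2f" % np.corrcoef(n1, n2)[0, 1])  # positive
print("different-preference pair r_sc: %.2f" % np.corrcoef(n1, n3)[0, 1])  # negative
```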

Article activity feed

  1. Evaluation Summary:

    The authors report that neurons in V1 and V4 provide multiplexed information about simultaneously presented objects. A combination of multiple single-unit recordings, statistical modelling of neuronal responses, and analyses of neuronal correlations argues in favor of their claims. Pairs of neurons with similar object preferences tended to be positively correlated when both objects were presented, while pairs of neurons with different object preferences tended to be negatively correlated. These patterns and others suggest that information about the two objects is multiplexed in time. There are, however, some unclear points that deserve discussion and further analysis that could more strongly support the claims.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #3 agreed to share their name with the authors.)

  2. Reviewer #1 (Public Review):

    The authors test whether neurons in V1 show "multiplexing", which means that when two stimuli A and B are presented inside their receptive fields (RFs), the neuronal response fluctuates across trials between coding one of the two, leading to a bimodal spike count histogram. They find evidence for this "mixture" model response in a subset of V1 neurons. They next test whether the spike count noise correlations (Rsc) vary between pairs of neurons that prefer the same versus different stimuli, and show that Rsc is positive for neurons that prefer the same stimulus but negative for neurons that prefer different stimuli.

    While this paper shows some intriguing results, I feel that there are a lot of open questions that need to be addressed before convincing evidence of multiplexing can be established. These points are discussed below:

    1. The best spike count model shown in Figure 2C is confusing. It seems that the number of "conditions" is a small fraction of the total number of conditions (and neurons?) that were tested. Supplementary Figure 1 provides more details (for example, the "mixture" corresponds to only 14% of total cases), but it is still confusing (for example, what does WinProb>Min mean?). From what I understood, the total number of neurons recorded for the Adjacent case in V1 is 1604, of which 935 are Poisson-like with substantially separated means. Each one has 2 conditions (for the two directions), leading to 1870 conditions (perhaps a few fewer where both conditions were not available). I think the authors should show 5 bar plots: the first showing the fraction of cases for which none of the models won with 2/3 probability, and then the remaining 4. That way it is clear how many of the total cases show the "multiplexing" effect. I also think it would be good to consider only neurons/conditions for which at least some minimum number of trials is available (a cutoff of, say, ~15), since the whole point is finding a bimodal distribution, for which enough trials are needed.

    2. More RF details need to be provided. What was the size of the V1 RFs? What was the eccentricity? Typically, the RF diameter in V1 at an eccentricity of ~3 degrees is no more than 1 degree. That is not large enough to fit 2 Gabors of 1 degree each inside the RF. How close were the Gabors? I am confused about the statement in the second paragraph of page 9, "typically only one of the two adjacent gratings was located within the RF" - I thought the whole point of multiplexing is that when both stimuli (A and B) are within the RF, the neuron nonetheless fires as if only A or only B were present? The analysis should only be conducted for neurons for which both stimuli are inside the RF. When studying noise correlations, only pairs with overlapping RFs, such that both A and B fall within the RFs of both neurons, should be considered. The cortical magnification factor at ~3-degree eccentricity is 2-2.5 mm/degree, so we expect the RF centers to shift by at least 2 degrees from one end of the array to the other.

    3. Eye data analysis: I am afraid this could be a big confound. Removing trials that had microsaccades is not enough. Typically, in these tasks the fixation window is 1.5-2 degrees, so if the monkey fixates on one corner of the window in some trials and another corner in other trials (without making any microsaccades in either), the stimuli may nonetheless fall inside or outside the RFs, leading to differences in responses. This needs to be ruled out. I do not find the argument presented on pages 18 or 23 completely convincing, since the eye positions could differ between the single-stimulus condition and the condition in which both stimuli are presented. It is important to show that the eye positions are similar in "AB" trials for which the responses are "A"-like versus "B"-like, and that these, in turn, are similar to the eye positions when "A" and "B" are presented alone.

    4. Figures 5 and 6 show that the difference in noise correlations between same-preference and different-preference pairs remains even for non-mixture type neurons. So, although the explanation for the particular pattern of noise correlations was given for multiplexing neurons (Figures 3 and 4), the same pattern seems to hold even for non-multiplexers. Although the absolute values are somewhat different across categories, one confound that still remains is that noise correlations typically depend on signal correlations, but here the signal correlation is not computed (only responses to 2 stimuli are available). If any tuning data are available for these recordings, it would be great to look at the noise correlations as a function of signal correlations for these different pair types (a sketch of such an analysis follows this list). Another analysis of interest would be to check whether the difference in noise correlation between the "A"/"B" conditions and the "AB" condition varies according to neuron pair category. Finally, since the authors mention in the Discussion that "correlations did not depend on whether the two units preferred the same stimulus or different", it would be nice to show that explicitly in Figure 5C by plotting the orange trace ("A" alone or "B" alone) for both same (green) and different (brown) pairs separately.
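
    A minimal sketch of the signal- versus noise-correlation analysis suggested in point 4 above, assuming a hypothetical array of spike counts with shape (neurons, conditions, trials) from tuning measurements; the data layout and names are illustrative, not taken from the paper:

    ```python
    # Hypothetical sketch: signal vs. noise correlations for all pairs of recorded units.
    import numpy as np
    from itertools import combinations

    def signal_and_noise_correlations(counts):
        """counts: array of shape (n_neurons, n_conditions, n_trials) of spike counts."""
        n_neurons = counts.shape[0]
        tuning = counts.mean(axis=2)                      # mean response per condition
        resid = counts - tuning[:, :, None]               # single-trial residuals
        z = resid / (counts.std(axis=2, keepdims=True) + 1e-12)
        z = z.reshape(n_neurons, -1)                      # pool residuals across conditions
        r_signal, r_noise = {}, {}
        for i, j in combinations(range(n_neurons), 2):
            r_signal[(i, j)] = np.corrcoef(tuning[i], tuning[j])[0, 1]
            r_noise[(i, j)] = np.corrcoef(z[i], z[j])[0, 1]
        return r_signal, r_noise
    ```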

  3. Reviewer #2 (Public Review):

    I am confused about the nature of the Poisson models. If I am correct, Poisson(lambda_A + lambda_B) is the distribution of the sum of two independent variables distributed as Poisson(lambda_A) and Poisson(lambda_B), that is, Poisson(lambda_A + lambda_B) = Poisson(lambda_A) + Poisson(lambda_B). Then, the mixture and intermediate models are very similar, and identical if a*lambda_A and (1-a)*lambda_B happen to be integer numbers.
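
    To make the comparison concrete, here is a small simulation of the three response profiles under discussion, using arbitrary illustrative rates (lambda_A = 8, lambda_B = 20) and mixing weight a = 0.5 that are not taken from the paper:

    ```python
    # Illustrative comparison of sum, intermediate, and mixture response models.
    import numpy as np

    rng = np.random.default_rng(1)
    lam_A, lam_B, a, n = 8.0, 20.0, 0.5, 100_000

    summed       = rng.poisson(lam_A + lam_B, n)                  # rate lambda_A + lambda_B
    intermediate = rng.poisson(a * lam_A + (1 - a) * lam_B, n)    # one rate between A and B
    pick_A       = rng.random(n) < a                              # trial-wise switching
    mixture      = np.where(pick_A, rng.poisson(lam_A, n), rng.poisson(lam_B, n))

    for name, x in [("sum", summed), ("intermediate", intermediate), ("mixture", mixture)]:
        print(f"{name:12s} mean = {x.mean():6.2f}   variance = {x.var():6.2f}")
    # The mixture matches the intermediate model in mean but is overdispersed
    # (variance exceeds the mean), which is what a bimodality analysis can pick up.
    ```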

    It is unclear why the 'outside' model predicts responses outside the range if neurons were to linearly sum the A and B responses.

    It is also unclear why the 'single' hypothesis would indicate a winner-take-all response. If I understand correctly, under this model the response to A+B is either the rate to A or the rate to B, but not necessarily the maximum of lambda_A and lambda_B. Also, this model could have been given an extra free parameter to modulate its amplitude in response to the stimulus A+B.

    The concept of "coarse population coding" can be misleading, as actual population codes can represent stimuli with quite good precision. The authors refer to the broad tuning of single cells, but this does not readily correspond to coarse population coding. This could be clarified.

    As a complement to the correlation analysis, one could check whether, on a trial-by-trial basis, the neuronal response of a single neuron is closer to the A+B response average, or to either the A or B responses. This would clearly indicate that the response fluctuates between representing A or B, or simultaneously represents A+B. I am trying to understand why this is not one of the main analyses of the paper instead of the correlation analysis, which involves two neurons instead of one.
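
    One possible implementation of the suggested trial-by-trial comparison for a single neuron is sketched below; the input arrays and the choice of a 'combined' reference (here the average of the A-alone and B-alone means) are illustrative assumptions, not the authors' method:

    ```python
    # Sketch: label each "AB" trial by the reference its spike count is closest to.
    import numpy as np

    def classify_ab_trials(counts_A, counts_B, counts_AB):
        """counts_A, counts_B, counts_AB: 1-D arrays of single-trial spike counts."""
        mu_A, mu_B = counts_A.mean(), counts_B.mean()
        refs = np.array([mu_A, mu_B, 0.5 * (mu_A + mu_B)])   # a sum reference could be used instead
        labels = np.array(["A", "B", "combined"])
        dist = np.abs(counts_AB[:, None] - refs[None, :])
        return labels[dist.argmin(axis=1)]

    # Example with simulated counts (half the AB trials "A-like", half "B-like"):
    rng = np.random.default_rng(2)
    lab = classify_ab_trials(rng.poisson(8, 50), rng.poisson(20, 50),
                             np.r_[rng.poisson(8, 25), rng.poisson(20, 25)])
    print({l: int((lab == l).sum()) for l in ("A", "B", "combined")})
    ```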

    In the discussion of noise correlations, the recent papers by Nogueira et al. (J. Neuroscience, 2020) and Kafashan et al. (Nat. Commun., 2021) could be cited. Noise correlations can also be made time-dependent, so the distinction between the temporal-correlation hypotheses and noise correlations might not be fundamental.

    It would be interesting to study the effect of contrast on the mixed responses. Is it reasonable to predict that with higher contrast the mixture responses would be more dominant than the single ones? This could be the case if the selection mechanism has a harder time suppressing one of the object responses. This would also predict that noise correlations will go down with higher contrast.

    What is the time bin size used for the analysis? Would the results be the same if one focused on the early part of the response or on the late part? At least for the units shown in Fig. 2, it looks as though one object's response is always delayed with respect to the other, so it would seem interesting to test noise correlations in those two temporal windows.
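
    A sketch of the suggested early- versus late-window comparison; the window boundaries and input format (spike times per trial, in seconds from stimulus onset) are illustrative assumptions, not values from the paper:

    ```python
    # Sketch: spike count correlation for a pair of units within a chosen time window.
    import numpy as np

    def windowed_counts(spike_times_per_trial, window):
        """spike_times_per_trial: list of 1-D arrays of spike times; window: (start, stop) in s."""
        t0, t1 = window
        return np.array([np.sum((st >= t0) & (st < t1)) for st in spike_times_per_trial])

    def rsc_in_window(spikes_unit1, spikes_unit2, window):
        c1 = windowed_counts(spikes_unit1, window)
        c2 = windowed_counts(spikes_unit2, window)
        return np.corrcoef(c1, c2)[0, 1]

    # e.g. rsc_in_window(unit1_trials, unit2_trials, (0.03, 0.15))  # "early" window
    #      rsc_in_window(unit1_trials, unit2_trials, (0.15, 0.50))  # "late" window
    ```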

  4. Reviewer #3 (Public Review):

    How do cortical neurons represent multiple concurrent stimuli? Does the representation depend on the stimuli being segregated versus fused into a single object? This manuscript addresses those questions, focusing on the response statistics of single neurons and pairs, mostly in macaque primary visual cortex V1, to paired visual stimuli (gratings) that are either spatially separate (segregated) or superimposed (fused).

    Although V1 responses to combinations of gratings have been studied extensively, the authors offer an innovative perspective, focusing on aspects of response statistics that have remained underexplored in past work. In my opinion, their findings are of broad interest because they challenge traditional understanding based on responses to 1-dimensional stimuli and shed new light on the longstanding "binding" problem. In particular, leveraging methods and findings previously published by some of the authors in the auditory cortex, they ask whether the responses of neurons to simultaneous stimuli switch, from trial to trial, between the responses to either of the individual stimuli. This would correspond to a coding scheme in which individual neurons encode, at each moment in time, either one of the two stimuli. The authors point out that such a coding scheme is reminiscent of early ideas on neural coding of perceptually ambiguous stimuli, and is conceptually distinct from (although not in contradiction with) divisive normalization, the more prominent existing framework for understanding neural responses to composite stimuli.

    Having provided empirical evidence that this kind of coding appears only when simultaneous stimuli are segregated, and not when they are superimposed, the larger part of the manuscript then addresses the follow-up question: for segregated stimuli, do populations of V1 neurons all encode the same stimulus component at each point in time? Or are there subgroups of neurons that encode distinct stimulus components, thus allowing both stimuli to be represented simultaneously? To address this question, the authors study how noise correlations (i.e. correlations in the response variability of pairs of neurons across repeated presentations of the same visual input) depend on stimulus conditions and on the tuning preferences of the neurons.

    Their main finding is that neurons with similar tuning (i.e. preferentially representing the same stimulus) are often positively correlated: when they switch from one stimulus to the other, they do so at the same time, so both neurons tend to encode the same stimulus. Conversely, neurons with opposite tuning are often negatively correlated, also consistent with both neurons encoding the same stimulus at any given time. However, across their datasets, the distributions of correlation values are broad enough to suggest a strategy where, at each moment in time, the majority of the neural population preferentially encodes one stimulus while a minority encodes the other, thus preserving the ability to represent multiple stimuli simultaneously. Importantly, these patterns of single-neuron and pairwise response statistics (i.e. shifting between component stimuli in a coordinated fashion) are absent when stimuli are superimposed or presented in isolation. Therefore, the results implicate the structure of cortical responses (beyond the much-studied average tuning and variability) in the process of object grouping and segregation.

    These results are potentially of very broad interest for the field, and the manuscript clearly places them in context. In addition, the analysis is sound (with a couple of minor caveats about the assumption of Poisson variability), the effect sizes are large and convincing, and the data will be useful to the community.

    However, in my view, an important methodological consideration deserves more scrutiny, namely to what extent the results could be a consequence of eye movements. And if so, what does that imply for the proposed coding scheme? Specifically, uncontrolled eye movements can effectively change the visual input from trial to trial and thereby affect the structure of response variability. The authors state that they excluded trials in which microsaccades were detected, but more details on the detection of microsaccades and on the threshold values for inclusion (relative to stimulus and RF sizes) should be provided, given how central a role they might play. In particular, the authors state in the Discussion that small residual eye movements would inflate response variability in all stimulus conditions. This is correct, but because of the stimulus design, it is possible (likely?) that the effects are quite different for segregated stimuli than for superimposed and single-stimulus conditions. Furthermore, the difference might be precisely in the direction of the effects reported. That is, because segregated stimuli are spatially separate and each stimulus covers only some of the receptive fields in the recording, eye movements could bring a different stimulus inside a given RF on different trials. In addition to producing bimodal response distributions for individual neurons, this would also induce positive correlations for pairs with the same preference and negative correlations for pairs with opposite preferences. On the other hand, in the superimposed condition, where the stimulus is large enough to cover all RFs, at most eye movements would bring (part of) the stimulus inside versus outside the RF across trials, thereby contributing to positive noise correlations for all pairs (i.e. to shifts from stimulus-driven to spontaneous activity, in the extreme case).
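
    As a point of reference for the requested detail, a common approach is the velocity-threshold method of Engbert & Kliegl (2003, Vision Research). The sketch below is a minimal version of that idea with illustrative parameter values (sampling rate, threshold multiplier, minimum duration); it is not the detection procedure used by the authors:

    ```python
    # Minimal velocity-threshold microsaccade detector in the spirit of Engbert & Kliegl (2003).
    import numpy as np

    def detect_microsaccades(x, y, fs=1000.0, lam=6.0, min_dur_ms=6.0):
        """x, y: eye position traces (deg) for one trial, sampled at fs Hz.
        Returns (start, end) sample indices of candidate microsaccades."""
        # Velocity from a 5-sample moving window (samples n-2, n-1, n+1, n+2)
        vx = fs * (x[4:] + x[3:-1] - x[1:-3] - x[:-4]) / 6.0
        vy = fs * (y[4:] + y[3:-1] - y[1:-3] - y[:-4]) / 6.0
        # Median-based velocity SD and an elliptical threshold at lam * SD
        sx = np.sqrt(np.median(vx**2) - np.median(vx)**2)
        sy = np.sqrt(np.median(vy**2) - np.median(vy)**2)
        above = (vx / (lam * sx))**2 + (vy / (lam * sy))**2 > 1.0
        # Keep supra-threshold runs lasting at least min_dur_ms
        min_len = int(round(min_dur_ms * fs / 1000.0))
        events, start = [], None
        for i, a in enumerate(above):
            if a and start is None:
                start = i
            elif not a and start is not None:
                if i - start >= min_len:
                    events.append((start, i))
                start = None
        if start is not None and len(above) - start >= min_len:
            events.append((start, len(above)))
        return events
    ```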