Competing neural representations of choice shape evidence accumulation in humans

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    This valuable study assesses how change in reward contingency in the environment affects the dynamics of a realistic large-scale neural circuit model, human choice behavior, and fMRI responses measured in the same individuals. It is not entirely clear which predictions of the neural circuit model go beyond previous work, and the current results seem incomplete and could likely be substantially strengthened. This study could be of interest to scientists studying the neural and computational bases of adaptive behaviour.

Abstract

Making adaptive choices in dynamic environments requires flexible decision policies. Previously, we showed how shifts in outcome contingency change the evidence accumulation process that determines decision policies. Using in silico experiments to generate predictions, here we show how cortico-basal ganglia-thalamic (CBGT) circuits can feasibly implement shifts in decision policies. When action contingencies change, dopaminergic plasticity redirects the balance of power, both within and between action representations, to divert the flow of evidence from one option to another. When competition between action representations is highest, the rate of evidence accumulation is lowest. This prediction was validated by in vivo experiments with human participants, using fMRI, which showed that (1) evoked hemodynamic responses can reliably predict trial-wise choices and (2) competition between action representations, measured with a classifier model, tracked with changes in the rate of evidence accumulation. These results paint a holistic picture of how CBGT circuits manage and adapt the evidence accumulation process in mammals.

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    1. The model's cortical neurons had no contralateral encoding, unlike their neuroimaging data.

    This is a common point of confusion, and the comment has prompted us to clarify our modeling decisions. For the CBGT pathways, we use a simplified model of isolated "action channels" that represent unique actions without specifying the true laterality of representations in the brain. As long as the populations representing each action are distinct and compete with one another, which is what we observed in our human neuroimaging data, our model assumptions apply regardless of hemisphere, despite the complicated lateralization of unimanual actions in reality.

    We now specify this in the main text:

    “It is important to note that, for the sake of parsimony, we adopt a simple and canonical model of CBGT pathways, with action channels that are agnostic as to the location of representations (e.g., lateralization), simply assuming that actions have unique population-level representations.”

    2. Another concern with this work is that it was unclear why the spiking neuronal network model with so many model parameters was used to account for coarse-scale fMRI data - a simple firing-rate neural population model would perhaps do the work.

    We acknowledge that a complex, biologically realistic neural network can seem of arguable scientific value when comparisons are coarse and made against macroscopic hemodynamic responses. However, it has clear value in setting the stage for future work that can unravel the nuances of the mechanism involved.

    To explain our rationale, we take an upward mapping perspective, where implementation-level models at lower levels represent the detailed biophysical properties of neurons and synapses, and models at higher levels represent the emergent properties of neural networks. This approach facilitates prediction at various levels of abstraction, including molecular, cellular, behavioral, and cognitive, by leveraging lower-level models to inform higher-level ones. For example, in other work, we are testing our model in mice using D1 and D2 optogenetic stimulation. We plan to use the same neural network to inform our predictions about these results. So, the complexity of the model does have a clear purpose for informing ongoing and future work by acting as a theoretical bridge between experiments across levels of analysis and spatiotemporal resolution. In our paper, the fMRI findings are compared with predicted dynamics at a common level of abstraction. Given the difference in resolution between these two approaches, our comparison is coarse.

    Regarding the reviewer’s concern about the number of parameters in the model, we address the dimensionality of our model in our analysis approach, described in the Results section:

    “To test whether these shifts in v are driven by competition within and between action channels, we predicted the network's decision on each trial using a LASSO-PCR trained on the pre-decision firing rates of the network (see Measuring neural action representations). The choice of LASSO-PCR was based on prior work building reliable classifiers from whole-brain evoked responses that maximize inferential utility (see Wager et al. 2011). The method is used when models are over-parameterized, as when there are more voxels than observations, relying on a combination of dimensionality reduction and sparsity constraints to find the true, effective complexity of a given model. While these are not considerations with our network model, they are with the human validation experiment that we describe next. Thus, we used the same classifier on our model as on our human participants to directly compare theoretical predictions and empirical observations.”
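    For readers unfamiliar with the method, below is a minimal sketch of the LASSO-PCR logic using scikit-learn: principal component analysis for dimensionality reduction, followed by an L1-penalized (lasso) regression on the components. This is an illustrative reconstruction under stated assumptions, not the authors' actual pipeline; the data shapes and hyperparameters are placeholders.

    ```python
    # Minimal LASSO-PCR-style classifier sketch: PCA handles the p >> n
    # problem, and the L1 (lasso) penalty sparsifies the weights over
    # components. Shapes and settings are illustrative, not the authors'.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 5000))  # e.g., 300 trials x 5000 voxels or unit firing rates
    y = rng.integers(0, 2, size=300)  # binary choice (e.g., left/right) on each trial

    clf = make_pipeline(
        StandardScaler(),
        PCA(n_components=0.9),  # keep components explaining 90% of variance
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    )

    # Cross-validated choice probabilities; classifier uncertainty is
    # highest when the predicted probability is near 0.5.
    proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
    uncertainty = 1.0 - 2.0 * np.abs(proba - 0.5)  # 1 = maximally uncertain
    ```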

    3. Moreover, the activity dynamics of the fMRI were not shown. It would have been more rigorous to show the fMRI (BOLD) signals in different (particularly CBGT) brain regions and compare that with the CBGT model simulations.

    The timing of the trials and the autocorrelational structure of the BOLD response make such fine-grained analysis unproductive, as the entire trial is subsumed under a single evoked response. While we sympathize with this question, the limitations of the fMRI signal restrict our resolution for evaluating intra-trial dynamics. Our follow-up work with neurophysiological recordings in rodents may help address this. Given these limitations, we now show averaged node-by-node comparisons for the simulated and human data in Fig. 3 - Fig. Supp. 5.
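    As a toy illustration of this point (not part of the authors' analysis), convolving a brief burst of simulated neural activity with a canonical double-gamma hemodynamic response function shows how an entire trial collapses into a single slow evoked response:

    ```python
    # Toy demonstration of temporal blurring by the BOLD response. A ~1.5 s
    # burst of activity, convolved with a canonical double-gamma HRF,
    # yields one evoked response spanning roughly 20 s.
    import numpy as np
    from scipy.stats import gamma

    dt = 0.1                              # seconds per sample
    t_hrf = np.arange(0, 30, dt)
    # SPM-style double-gamma HRF: peak near 5-6 s, undershoot near 15-16 s.
    hrf = gamma.pdf(t_hrf, 6) - 0.35 * gamma.pdf(t_hrf, 16)
    hrf /= hrf.sum()

    activity = np.zeros(600)              # 60 s of simulated neural activity
    activity[100:115] = 1.0               # ~1.5 s of pre-decision firing
    bold = np.convolve(activity, hrf)[:600]
    # 'bold' peaks several seconds after the burst and decays slowly, so
    # within-trial events (onset, accumulation, commitment) are unresolvable.
    ```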

    4. The association between classifier uncertainty and drift rate (by participant) differed by an order of magnitude between the simulated and actual participants (compare Figure 2E with Figure 4B).

    You make a valid point about the difference in effect magnitude between the model and data. The greater effect observed in the simulated data is due to several factors: 1) simulated data is not affected by the same sources of noise as human data, 2) the model is not susceptible to non-task related variance, 3) the model was used to predict associations seen in humans, and fine-tuning the model using human data would result in circular inference, and 4) the simulations used only a single experimental condition with deterministic volatility, while human experiments varied the relative value of the two options and volatility, leading to increased variance in human responses. The goal was to compare the qualitative pattern of results, and the discrepancy in magnitude is addressed in the Discussion section of the revised manuscript:

    “Careful attention to the effect size of our correlations between channel competition and drift rate shows that the effect is substantially smaller in humans than in the model. This is not surprising and is due to several factors. First, the simulated data are not affected by the same sources of noise as the hemodynamic signal, whose responses can be greatly influenced by factors such as heterogeneity of cell populations and properties of underlying neurovascular coupling. Additionally, our model is not susceptible to non-task-related variance, such as fatigue or lapses of attention, which the humans likely experienced. We could have fine-tuned the model results based on the empirical human data, but that would contaminate the independence of our predictions. Finally, our simulations only used a single experimental condition, whereas human experiments varied the relative value of options and volatility, which led to more variance in human responses. Yet, despite these differences, we see qualitative similarities in both the model and human results, providing confirmation of a key aspect of our theory.”

    5. There was also a weak effect on human reaction times (Supp. Fig. 2).

    Trial-by-trial reaction times are indeed noisy. However, our estimates rely on the distribution of reaction times, rather than trial-by-trial values.

    6. There were only 4 human participants who performed the experiment - the results would perhaps be better with more human participants.

    We understand where this comment comes from and are sympathetic to the initial thought, but we should point out that our experimental design mirrors the type used in non-human primate research: collect an entire experiment’s worth of data from a single participant and replicate the effects across new participants. We have a total of 2,700 trials per participant (10,800 trials across all participants). Each participant completed as many trials as would typically be collected in an entire single-run or single-session experiment with a sample of ~40 participants. Our statistical power was focused on within-subjects replication, meaning that each participant can be thought of as their own independent experiment, with sufficient statistical power to address our primary research hypothesis. Thus, in our experimental design, each run is an observation, as opposed to each participant as in typical fMRI experiments, and each participant is then considered a replication test of the other participants.

    We now emphasize the statistical power on a single-subject basis in the Results section:

    “Crucially, we designed this experiment such that each participant acted as an out-of-set replication test, having performed thousands of trials individually. Specifically, to ensure we had the statistical power to detect effects on a participant-by-participant basis, we collected an extensive data set comprising 2700 trials over 45 runs from nine separate imaging sessions for each of four participants. Consequently, we amassed a grand total of 36 hours of imaging data over all participants, which was used to evaluate the replicability of our findings at the participant-by-participant level. Therefore, our statistical analyses were able to estimate effects on a single-participant basis.”

    7. For such a complex biophysical computational model, there could perhaps have been more model predictions provided.

    Using a biologically realistic neural network may not be useful for finer-grained comparisons, but it can inform future work. By mapping upward from lower-level to higher-level models, we can predict emergent properties at different levels of abstraction. The model's complexity is necessary for informing ongoing and future work, such as testing the model in other organisms. While the comparison with fMRI findings is coarse, we address the dimensionality of our model in our analysis approach.

    Reviewer #2 (Public Review):

    1. In this paper, Bond et al. build on previous behavioral modeling of a reversal-learning task. They replicate some features of human behavior with a spiking neural network model of cortical basal ganglia thalamic circuits, and they link some of these same behavioral patterns to corresponding areas with BOLD fMRI. I applaud the authors for sharing this work as a preprint, and for publicly sharing the data and code.

    Thank you for your thoughtful comments on our work! We also appreciate your recognition of our efforts to openly share our data and code.

    2. While the spiking neural network model offers a helpful tool to complement behavior and neuroimaging, it is not very clear which predictions are specific to this model (and thus dissociate it from, or go beyond, previous work). Thus, the main strength of this work (combining behavior, brain, and in silico experiments) is not fully fleshed out and could be stronger in the conclusions we can draw from them.

    We agree that further exploration of the specific predictions offered by our spiking neural network model would be valuable to fully delineate its contribution to the field. In our current work, we link our simulated neural network dynamics with whole-brain hemodynamic data, which limits the temporal resolution and complexity of our comparisons. We recognize that a more detailed investigation of the unique contributions of our spiking neural network model would be an important next step toward better understanding the mechanisms underlying the observed behavioral patterns. Indeed, we are currently conducting follow-up work in mice to test finer-grained predictions of cellular dynamics.

    3. It would be helpful to know more about which features of the spiking NN model are crucial in precisely replicating the behavioral patterns of interest (and to be more precise in which behaviors are replicated from previous work with the same task, vs. which ones are newly acquired because the task has changed - or the spiking CBGT model has afforded new predictions for behavior). Throughout, I am wondering if the authors can compare their results to a reasonable 'null model' which can then be falsified (e.g. Palminteri et al. 2017, TICS); this would give more intuition about what it is about this new CBGT model that helps us predict behavior. The same question about model comparison holds for the behavior: beyond relying on DIC score differences, what features of behavior can and cannot be explained by the family of DDMs?

    You raise a crucial point. In our original manuscript, we only compared the single and pairwise variants of the HDDM model and a null model predicting no change in decision policy. The drift rate model best fit the data among these comparisons.

    However, our main claim relies on the link between neural data, behavior, and the underlying cognitive process. Previously, we did not test other variants of this central linking hypothesis. To address this, we tested an alternative linking hypothesis using boundary height instead of drift rate as our target variable. We found a null association with classifier uncertainty. This definitely provides a more rigorous test of our primary hypothesis, and we thank the reviewer for raising this point.
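    For concreteness, the sketch below shows how this kind of linking-hypothesis comparison could be run with the HDDM toolbox, regressing trial-wise classifier uncertainty onto drift rate (v) versus boundary height (a) and comparing fits via DIC. The file and column names are hypothetical placeholders, not the authors' actual code.

    ```python
    # Hedged sketch of the linking-hypothesis comparison: does trial-wise
    # classifier uncertainty modulate drift rate (v) or boundary height (a)?
    # File and column names ('rt', 'response', 'uncertainty') are placeholders.
    import hddm

    data = hddm.load_csv("behavior_with_uncertainty.csv")

    m_drift = hddm.HDDMRegressor(data, "v ~ uncertainty")
    m_drift.sample(2000, burn=500)

    m_bound = hddm.HDDMRegressor(data, "a ~ uncertainty")
    m_bound.sample(2000, burn=500)

    # Lower DIC indicates a better fit after penalizing model complexity; a
    # null uncertainty effect in the boundary model would support the
    # drift-rate linking hypothesis.
    print("DIC (v ~ uncertainty):", m_drift.dic)
    print("DIC (a ~ uncertainty):", m_bound.dic)
    ```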

  2. eLife assessment

    This valuable study assesses how change in reward contingency in the environment affects the dynamics of a realistic large-scale neural circuit model, human choice behavior, and fMRI responses measured in the same individuals. It is not entirely clear which predictions of the neural circuit model go beyond previous work, and the current results seem incomplete and could likely be substantially strengthened. This study could be of interest to scientists studying the neural and computational bases of adaptive behaviour.

  3. Reviewer #1 (Public Review):

    This manuscript made use of a biologically realistic neuronal network model of cortico-basal ganglia-thalamic (CBGT) circuits and a cognitive drift-diffusion model (DDM) to account for both behavioural and functional neuroimaging (fMRI) data and to understand how change in reward contingency in the environment can affect different decision dynamics. They found that the rate of evidence accumulation was most affected, allowing explorative behaviour with a lower drift rate when contingency change was likely and exploitative behaviour with a higher drift rate when contingencies likely remained the same. The multi-pronged approach presented in the manuscript is commendable. The biophysical model was sufficiently realistic, with varying ramping firing rates of spiny projection neurons linked to the varying drift rates in the DDM. However, there are a few concerns regarding this work.

    The model's cortical neurons had no contralateral encoding, unlike their neuroimaging data. Another concern with this work is that it was unclear why the spiking neuronal network model with so many model parameters was used to account for coarse-scale fMRI data - a simple firing-rate neural population model would perhaps do the work. Moreover, the activity dynamics of the fMRI were not shown. It would have been more rigorous to show the fMRI (BOLD) signals in different (particularly CBGT) brain regions and compare that with the CBGT model simulations.

    The association between classifier uncertainty and drift rate (by participant) differed by an order of magnitude between the simulated and actual participants (compare Figure 2E with Figure 4B). There was also a weak effect on human reaction times (Supp. Fig. 2).

    There were only 4 human participants who performed the experiment - the results would perhaps be better with more human participants.

    For such a complex biophysical computational model, there could perhaps have been more model predictions provided.

    Overall, this work is interesting and could potentially be a good contribution in the area of computational modelling and neuroscience of adaptive choice behaviour.

  4. Reviewer #2 (Public Review):

    In this paper, Bond et al. build on previous behavioral modelling of a reversal-learning task. They replicate some features of human behavior with a spiking neural network model of cortical basal ganglia thalamic circuits, and they link some of these same behavioral patterns to corresponding areas with BOLD fMRI. I applaud the authors for sharing this work as a preprint, and for publicly sharing the data and code.

    While the spiking neural network model offers a helpful tool to complement behavior and neuroimaging, it is not very clear which predictions are specific to this model (and thus dissociate it from, or go beyond, previous work). Thus, the main strength of this work (combining behavior, brain, and in silico experiments) is not fully fleshed out and could be stronger in the conclusions we can draw from them.

    It would be helpful to know more about which features of the spiking NN model are crucial in precisely replicating the behavioral patterns of interest (and to be more precise in which behaviors are replicated from previous work with the same task, vs. which ones are newly acquired because the task has changed - or the spiking CBGT model has afforded new predictions for behavior). Throughout, I am wondering if the authors can compare their results to a reasonable 'null model' which can then be falsified (e.g. Palminteri et al. 2017, TICS); this would give more intuition about what it is about this new CBGT model that helps us predict behavior.

    The same question about model comparison holds for the behavior: beyond relying on DIC score differences, what features of behavior can and cannot be explained by the family of DDMs?