ACC neural ensemble dynamics are structured by strategy prevalence

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    This manuscript posits a novel role for the anterior cingulate cortex (ACC) in coding for sequential action strategies and the prevalence of each strategy. These findings provide important insight into ACC function and will therefore be of broad interest within the field of cognitive neuroscience. The evidence supporting the primary hypothesis is currently incomplete but could be rendered convincing with some further effort to rule out potential confounding factors.

Abstract

Medial frontal cortical areas are thought to play a critical role in the brain’s ability to flexibly deploy strategies that are effective in complex settings, yet the underlying circuit computations remain unclear. Here, by examining neural ensemble activity in male rats that sample different strategies in a self-guided search for latent task structure, we observe robust tracking during strategy execution of a summary statistic for that strategy in recent behavioral history by the anterior cingulate cortex (ACC), especially by an area homologous to primate area 32D. Using the simplest summary statistic – strategy prevalence in the last 20 choices – we find that its encoding in the ACC during strategy execution is wide-scale, independent of reward delivery, and persists through a substantial ensemble reorganization that accompanies changes in global context. We further demonstrate that the tracking of reward by the ACC ensemble is also strategy-specific, but that reward prevalence is insufficient to explain the observed activity modulation during strategy execution. Our findings argue that ACC ensemble dynamics is structured by a summary statistic of recent behavioral choices, raising the possibility that ACC plays a role in estimating – through statistical learning – which actions promote the occurrence of events in the environment.
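
To make the summary statistic concrete, here is a minimal sketch of a sliding-window prevalence computation. It is an illustration only, not the authors' code: the function name `strategy_prevalence`, the string labels for sequences, and the choice to exclude the current choice from its own window are all assumptions. Only the window length of 20 comes from the abstract.

```python
from collections import deque


def strategy_prevalence(choices, window=20):
    """Fraction of the preceding `window` choices that match the current one.

    `choices` is a list of strategy labels (hypothetical format, e.g. "LLR").
    The current choice is excluded from its own window (an assumption).
    The very first choice has no history and is assigned 0.0.
    """
    recent = deque(maxlen=window)  # holds at most `window` past labels
    prevalence = []
    for label in choices:
        if recent:
            matches = sum(1 for past in recent if past == label)
            prevalence.append(matches / len(recent))
        else:
            prevalence.append(0.0)
        recent.append(label)  # the oldest label drops out automatically
    return prevalence
```

Early in a session the window holds fewer than 20 choices, so this sketch normalizes by the number of choices available rather than by 20; other conventions are equally plausible.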

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    “In analyzing neural activity accompanying the behavioral persistence of the dominant sequence after a block change, the authors find that the ACC ensemble firing pattern is closer to the original dominant sequence pattern during reinforcement and less like this pattern during exploration… As time, and trials, progress the rat is approaching the point at which it explores another strategy. The authors find strengthened "prevalence" encoding with increasing sequence repetition, but if this parameter is related to behavioral change/flexibility, this was not clear to me. Might there be something unique about the last trials in a tail "predicting" an upcoming switch? Can the authors please expand? Relatedly, if the prediction of upcoming behavioral change is not observed in the neural activity from sequence steps 2-6, it is notable that these are the steps 'within' the sequence, that leaves out the initiation (first center poke) and termination (reward/reward omission). Thus one could imagine this information is "missed" in the current analysis given that both the reward period and the initiation of a trial at the center are not analyzed. This does lead me to suggest a softening of some claims made of identifying "unifying principles" of ACC function, as the authors state, based on the analyses included in the current report, since the neural activity related to the full unit of behavior is not considered. (I appreciate the motivation behind this focus on within-sequence behavior - the wish to compare time periods with similar movement parameters.)”

    We apologize for the confusion: while the sequence prevalence itself tends to be high for ‘dominant tails’, we do not claim that the prevalence model fits better at those sequence instances. We share the Reviewer’s interest in linking prevalence encoding to behavioral adaptation, as well as the intuition that block transitions should be among the epochs where strategy prevalence is tracked particularly well. Indeed, we spent a considerable amount of time thinking about whether we could identify and interpret periods during the session where our prevalence model fits better or worse. Two arguments convinced us to abandon that direction: a technical one and a conceptual one. The technical argument is that when the explanatory power of a variable is limited, regression residuals are proportional to the variable itself. Thus, any meaningful comparison of the model’s fit would have to be restricted to periods where strategy prevalence falls within a similar range. The conceptual argument is even more disarming: imagine that we do identify a putative session epoch where the model fits worse. While this could truly mean that the animal is tracking less closely how much he has pursued this strategy in the recent past, it is equally possible that we were simply off in selecting the specific window over which the prevalence signal is estimated, the exact behavioral statistic tracked, or the exact form of the dependence between that statistic and neural activity. We certainly do see changes leading up to behavioral switches at block transitions – something we plan to elaborate on in a subsequent paper – but whether those are related to prevalence tracking is a question we believe is hard to crack.

  2. eLife assessment

    This manuscript posits a novel role for the anterior cingulate cortex (ACC) in coding for sequential action strategies and the prevalence of each strategy. These findings provide important insight into ACC function and will therefore be of broad interest within the field of cognitive neuroscience. The evidence supporting the primary hypothesis is currently incomplete but could be rendered convincing with some further effort to rule out potential confounding factors.

  3. Reviewer #1 (Public Review):

    This manuscript by Proskurin, Manakov, and Karpova posits a unique role for the anterior cingulate cortex (ACC) in the flexible control of learned sequences of motor actions. The authors marshal evidence from behavioral-electrophysiological analyses in support of two major claims: that action encoding by ACC ensembles tracks 1) the current "context", i.e., which behavioral sequence is rewarded, and 2) the "prevalence", i.e., number of repetitions of one specific sequence. An important aspect of this latter point is that the authors propose prevalence encoding is not strictly dependent on trial-by-trial reward receipt.

    In this work, the authors wish to focus on self-initiated behavior when the correct behavioral sequence, out of four or fewer (mostly two it appears), changes across blocks in an unsignaled manner. Rats learn to enter a left and right nose poke in a sequence of three responses, with a required entry into a central port prior to each intra-sequence response, with correct sequence completion reinforced by a sucrose delivery in the relevant side nose poke port. Extracellular spike activity is acquired from well-trained rats performing this task. The authors' analyses of the behavior of well-trained rats show rats adjust their behavior when the block switches to a different one of the sequences in the known 'library'. The rats also perform non-reinforced responses/sequences within a given block which the authors suggest is exploration likely not triggered by changes in reinforcement in contrast to behavior change after a block switch.

    The authors next provide a very rich set of analyses to examine the encoding of responses and sequences by ACC neural activity. Overall, these data provide intriguing support for ACC's integral contributions to flexible behavioral control. However, some of the individual analyses are a bit difficult to follow and could be clarified with greater detail within the results section of the paper, permitting an easier evaluation of the quality of the supporting data. Second, there are some proposals that could be strengthened by fuller analysis, in particular the authors' suggestion that "prevalence" encoding is distinct from reward encoding and/or is not impacted by reward presence or omission. Given the likely rich data set in hand, the authors could do more to demonstrate how "prevalence" encoding interacts with reinforcement parameters or perhaps be more specific in their word choice. More importantly, I was left unclear on how "prevalence" encoding intersects with the decision to repeat the same behavioral sequence on the next trial or not. These issues aside, this work provides further information on the physiology of ACC during flexible behavior and will be an important addition to this field.

    Below are specific issues:

    1. Some greater attention to the behavioral parameters could be helpful, especially regarding the impact of reward rate on behavior. For example, looking at some of the figures of individual rat behavior, exploratory sequences seemed triggered by reward omission. Is this just chance in the examples chosen, or is there something systematic here? Upon block switch, how exactly does the switch in sequences emitted by the rat track with reinforcement history? The authors mention that reinforcement probability differed across sessions, and one would thus expect that switching behavior would as well. Because of the interesting existence of sometimes quite long 'tails' of performance of the original sequence after a block switch, I am wondering how the length of such tails relates to reinforcement rate parameters.

    2. The authors provide strong data indicating that a given L or R response is associated with distinct ACC activity depending on which sequence that response is embedded within, a finding reminiscent of other reports in multiple brain regions. While not a criticism per se, I was interested in the center port responses, also embedded within unique sequences, yet never preceding reward. A key difference in the performance of a given R or L response is that it is sometimes the terminal response, and thus the rat knows a given R or L response to be sometimes reinforced in one of the contexts, but not the other, in each of these comparisons. I wonder if there was an opportunity to cleanly demonstrate the context dependence of a given individual action by comparing center port responses across distinct sequences.

    3. In analyzing neural activity accompanying the behavioral persistence of the dominant sequence after a block change, the authors find that the ACC ensemble firing pattern is closer to the original dominant sequence pattern during reinforcement and less like this pattern during exploration. This makes sense and must be the case, as, in the example shown in the figure, the rat does not "know" the block has switched since no reward has yet been delivered that would signal that switch. (As an aside, it would be interesting to know, given a specific reward schedule in a given session, what would be the maximum number of unrewarded trials within the block, and how might that impact the performance/reward expectation during the tails?)
    As time, and trials, progress the rat is approaching the point at which it explores another strategy. The authors find strengthened "prevalence" encoding with increasing sequence repetition, but if this parameter is related to behavioral change/flexibility, this was not clear to me. Might there be something unique about the last trials in a tail "predicting" an upcoming switch? Can the authors please expand?
    Relatedly, if the prediction of upcoming behavioral change is not observed in the neural activity from sequence steps 2-6, it is notable that these are the steps 'within' the sequence, that leaves out the initiation (first center poke) and termination (reward/reward omission). Thus one could imagine this information is "missed" in the current analysis given that both the reward period and the initiation of a trial at the center are not analyzed. This does lead me to suggest a softening of some claims made of identifying "unifying principles" of ACC function, as the authors state, based on the analyses included in the current report, since the neural activity related to the full unit of behavior is not considered. (I appreciate the motivation behind this focus on within-sequence behavior - the wish to compare time periods with similar movement parameters.)

    4. The variance in neural activity explained by the prevalence models is on average quite low. However, the authors find that the variance explained differs quite dramatically by anatomical coordinate within ACC. Would it make sense to focus the control analyses (vigor, reward history, and so on) on those sessions/ensembles with greater variance explained, i.e., perhaps there might be greater sensitivity to detecting interactions among variables within ensembles recorded more rostrally?

    5. A very intriguing aspect of this work is the position that (from the abstract): "Prevalence encoding in the ACC is ...independent of reward delivery." This is a novel aspect of the current work. However, I am wondering if the authors can refine and expand upon this. I find it difficult to disentangle prevalence encoding and impacts of reward in the way the data and interpretation are presented in some areas of the text. While neural encoding may not reflect trial-by-trial reward receipt, clearly the rat's decision to repeat a given sequence or initiate a new sequence is impacted by reinforcement parameters and reward expectation. Thus being very exact in the interpretation would be helpful.

  4. Reviewer #2 (Public Review):

    Correctly keeping track of behavioral strategies allows for flexible context-appropriate behaviors. Several brain regions, including the anterior cingulate cortex (ACC), have been proposed to be involved in this process. But its neural correlates and computation principles still need to be uncovered, especially at the neural population level.

    In this manuscript, to find such neural correlates, the authors create a behavioral task in which rats must discover a strategy and use it to obtain a reward. Specifically, the authors train rats to perform a self-initiated nose-poking task in which, within every 250-500 trials, performing a target '3-step action sequence' leads to sucrose reward delivery. The target action sequence is viewed as 'latent' because it is un-signaled, and rats have to infer it based on past choices and outcomes. Behavioral analyses show that rats' actions comply with the target action sequence after training. However, even at the expert level, rats sometimes show deviations from choosing the target action sequence and instead choose the alternative action sequence. Based on several criteria, the authors identify most of these deviations to reflect an 'exploratory' nature of the rats' behavior in this task. Tetrode recordings in these trained rats show that most ACC neurons encode 'strategy prevalence,' basically, a signal telling which strategy dominates rats' sequential nose-poking actions. Such representation is not restricted to ACC and is also found in M2 and SMC, though with less pronounced correlations. Beyond encoding such a 'global' strategy, the ACC neurons also show activity related to 'local' fluctuations in rats' choices, which the authors argue cannot be explained by several commonly considered behavioral variables, including movement kinematics and vigor and reward expectation. Interestingly, the strategy prevalence is decodable across sequence execution time with a weight-fixed decoder, even though most neurons show transient selectivity to strategy prevalence at the single-cell level, showing the importance of neural population representation.

    The behavioral task design is complicated yet appealing. In this task, rats must constantly adjust their behavioral strategy to align with the un-signaled target sequence changes. The task design and the following neural data analyses represent a technical strength of the current study. After controlling for many confounding factors, the ACC neural activities distinguish between 'dominant' vs. 'exploratory' sequence prevalence and contain the specific sequence identities. Building upon their previous work, in this study, the authors reveal more detailed neural dynamics mechanisms for the involvement of ACC in signaling subjective behavioral strategy other than the actual task rule. These findings are conceptually important and would greatly draw the attention of many interested in the neural mechanisms of higher-order brain functions at the systems level.

    The primary weakness of the study, however, is that the behavioral and data analyses cannot eliminate all the confounding factors, although, in certain conditions, such influences can be minimized to an acceptable level. That said, the current analyses only partially support the authors' conclusions. Nevertheless, despite these limitations, this study aiming at isolating neural correlates of the 'strategy prevalence' has substantial value in its methodology and proposed hypothesis on ACC behavioral functions and would likely have a significant impact on the field. The innovative data analysis methods implemented in the study can be helpful for related behavioral electrophysiological and imaging studies. Besides, mapping the putative SMC and ACC area to primate SMC and 32D helps to connect the research in rodents and primates.

  5. Reviewer #3 (Public Review):

    Proskurin and colleagues aim to test if neurons in rat medial prefrontal cortex encode strategy in a serial choice task. They recorded neural activity as rats performed a nose-poke task for reward. Rats were required to discover, without explicit instruction, which of the possible 3-action sequences was rewarded. One of several possible sequences remained the target (thereby triggering reward delivery) over a block of trials, before switching to an alternate sequence. The authors then used analysis of single neurons and ensembles of neural activity to determine if neural activity reflected whether a sequence was the dominant strategy in a block or an explorative test.

    The strengths of the work include the timely and important hypothesis, and the use of appropriate methodologies to test it.

    I commend the authors for endeavouring to tackle this challenging topic. The weaknesses of the work derive from the difficulties of studying such a challenging topic. It is extremely difficult to ascribe the variance of neural activity to a latent variable such as strategy, particularly in freely-moving animals motivated by reward. This is because of the plethora of potential confounders. For instance, the authors compare the encoding of one action (L) in two sequences (RLL and LLR). However, the analyzed action occurs in different local contexts. In the first, it is the middle action, and in the second it is the first action following a reward omission. Even though the reward is withheld, the rat presumably has some reward expectation. Because strategy is a latent variable, the evidentiary threshold is high, and alternate explanations of neural variance need to be rejected. This is particularly important given the neural structures under investigation are involved in regulating motor output, suggesting that differences in response speed, body position, and related variables may explain considerable variance in neural activity. Other potential explanatory variables are rule certainty, position in the sequence, side chosen, preceding choice, and changes in firing rate as the session progresses due to changes in motivation, fatigue, or drift in the signal. The authors attempt to address some of these, but this is done in a very condensed presentation near the end of the results. This needs to be unpacked (and visualized) in order for readers to evaluate whether the strategy is the most likely explanation of neural variance, as proposed by the authors. The paper would benefit from analyses, such as multiple regression over all possible predictive variables, to evaluate the relative amount of neural signal variance attributable to strategy dominance compared to other information.
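
    As a rough illustration of the kind of multiple-regression comparison suggested above, the following sketch estimates how much variance each predictor explains uniquely, by comparing a full ordinary least-squares model against models with one predictor left out. It is a sketch under stated assumptions, not the authors' analysis: the function names `r_squared` and `unique_variance`, the predictor names, and the use of a simple linear model are all hypothetical choices made here for illustration.

```python
import numpy as np


def r_squared(X, y):
    """Coefficient of determination for an OLS fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    total = np.sum((y - y.mean()) ** 2)
    return 1.0 - np.sum(residuals ** 2) / total


def unique_variance(X, y, names):
    """Return the full-model R^2 and the drop in R^2 when each predictor is removed.

    X is an (n_samples, n_predictors) array; `names` labels its columns
    (hypothetical labels such as "prevalence", "reward_history", "vigor").
    """
    full = r_squared(X, y)
    unique = {}
    for j, name in enumerate(names):
        reduced = np.delete(X, j, axis=1)  # leave predictor j out
        unique[name] = full - r_squared(reduced, y)
    return full, unique
```

    Here `y` would be one neuron's (or one ensemble dimension's) activity per sequence instance. The leave-one-out drop in R^2 attributes only non-shared variance to each predictor, so correlated predictors such as prevalence and reward history would each look weaker than they do in isolation.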

    An additional weakness of the manuscript is the absence of some fundamental checks on data quality, particularly for bias in animal behavior, stability of neural activity during sessions, and bias in data sampling for classifier sampling.

    In sum, the experimental methodology appears sufficient to address the authors' aim of evaluating the encoding of strategy by neurons in the medial prefrontal cortex. Alternate interpretations of the data, however, are not sufficiently ruled out by the analysis to strongly support the claim that the exploration of strategy is the primary driver of altered neural signalling. The data and methodologies are valuable to behavioral and systems neuroscientists. The task and the finding that rats appear to spontaneously explore alternate strategies are elegant, and a very nice paradigm for studying the neural mechanisms of strategy shifting. Moreover, the finding that many neurons in the medial prefrontal cortex change their firing rate during the task is an important new contribution. Future analysis and experiments will undoubtedly better resolve the information encoded by these changes in firing rate.