A reservoir of timescales emerges in recurrent circuits with heterogeneous neural assemblies

Curation statements for this article:
  • Curated by eLife

    eLife logo

    eLife assessment

    This fundamental work uses computational network models to suggest a possible origin of the wide range of time scales observed in cortical activity. This claim is supported by convincing evidence based on comparisons between mathematical theory, simulations of spiking network models, and analysis of recordings from the orbitofrontal cortex. This manuscript will be of interest to the broad community of systems and computational neuroscience.

This article has been Reviewed by the following groups

Read the full article See related articles

Abstract

The temporal activity of many physical and biological systems, from complex networks to neural circuits, exhibits fluctuations simultaneously varying over a large range of timescales. Long-tailed distributions of intrinsic timescales have been observed across neurons simultaneously recorded within the same cortical circuit. The mechanisms leading to this striking temporal heterogeneity are yet unknown. Here, we show that neural circuits, endowed with heterogeneous neural assemblies of different sizes, naturally generate multiple timescales of activity spanning several orders of magnitude. We develop an analytical theory using rate networks, supported by simulations of spiking networks with cell-type specific connectivity, to explain how neural timescales depend on assembly size and show that our model can naturally explain the long-tailed timescale distribution observed in the awake primate cortex. When driving recurrent networks of heterogeneous neural assemblies by a time-dependent broadband input, we found that large and small assemblies preferentially entrain slow and fast spectral components of the input, respectively. Our results suggest that heterogeneous assemblies can provide a biologically plausible mechanism for neural circuits to demix complex temporal input signals by transforming temporal into spatial neural codes via frequency-selective neural assemblies.

Article activity feed

  1. Author Response

    Reviewer 1 Public Review

    The authors aim to theoretically explain the wide range of time scales observed in cortical circuits in the brain – a fundamental problem in theoretical neuroscience. They propose that the variety of time scales arises in recurrent neural networks with heterogeneous units that represent neuronal assemblies of different sizes that transition through sequences of high- and low-activity metastable states. When transitions are driven by intrinsically generated noise, the heterogeneity leads to a wide range of escape times (and hence time scales) across units. As a mathematically tractable model, they consider a recurrent network of heterogeneous bistable rate units in the chaotic regime. The model is an extension of the previous model by Stern et al (Phys. Rev. E, 2014) to the case of heterogeneous self-coupling parameters. Biologically, this heterogeneous parameter is interpreted as different assembly sizes. The chaoticity acts as intrinsically generated noise-driving transitions between bistable states with escape times that are indeed widely distributed because of the heterogeneity. The distribution is successfully fitted to experimental data. Using previous dynamic mean-field theory, the self-consistent auto-correlation function of the driving noise in the mean-field model is computed (I guess numerically). This leaves the theoretical problem of calculating escape times in the presence of colored noise, which is solved using the unified colored-noise approximation (UCNA). They find that the log of the correlation time of a given unit increases quadratically with the self-coupling strength of that unit, which nicely explains the distribution of time scales over several orders of magnitude. As a biologically plausible implementationof the theory, they consider a spiking neural network with clustered connectivity and heterogeneous cluster sizes (extension of the previous model by Mazzucato et al. J Neurosci 2015). Simulations of this model also exhibit a quadratic increase in the log dwell time with cluster size. Finally, the authors demonstrate that heterogeneous assemblies might be useful to differentially transmit different frequency components of a broadband stimulus through different assemblies because the assembly size modulates the gain.

    I found the paper conceptually interesting and original, especially the analytical part on estimating the mean escape times in the rate network using the idea of probe units and the UCNA. It is a nice demonstration of how chaotic activity serves as noise-driving metastable activity. Calculating the typical time scales of such metastable activity is a hard theoretical problem, for which the authors made considerable advancement. The conclusions of this paper are mostly well supportedby simulations and mathematical analysis, but some aspects need to be clarified and extended, especially concerning the biological plausibility of the rate network model and its relation to the spiking neural network model as well as the analytical calculation of the mean dwell time.

    Question 1a. The theory is based on a somewhat unbiological network of bistable rate units. It seems to only loosely apply to the implementation with a spiking neural network with clustered architecture, which is used as a biological justification of the rate model. In the spiking model, a wide distribution of time scales also emerges as a consequence of noise-induced escapes in combination with heterogeneity. Apart from this analogy, however, the mechanisms for metastability seem to be quite different: firstly, the functional units in the spiking neural network are presumably not bistable themselves but multistability only emerges as a network effect, i.e. from the interaction with other assemblies and inhibitory neurons. (This difference yields anti-correlations between assemblies in the spiking model, in marked contrast to the independence of bistable rate units (if N is large).) Secondly, transitions between metastable states are presumably not driven by chaotic dynamics but by finite-size fluctuations (e.g. Litwin-Kumar and Doiron 2012). The latter is also strongly dependent on assembly size. More precisely, the mechanism of how assembly size shapes escape times T seems to be different: in the rate model the self-coupling ("assembly size") predominantly affects the effective potential, whereas in the spiking network, the assembly size predominantly affects the noise. Therefore, the correspondence between the rate model and the spiking model should probably be regarded in a looser sense than presented in the paper.

    Answer 1a. We thank the Reviewer for suggesting to clarify the relationship between the rate and spiking model. In this answer, we first show that the dynamicalmodes in the spiking network are E/I cluster pairs, then we show that assemblies are bistable due to the large self-couplings, and third we discuss whether transitions between high and low activity states are driven by chaos or finite size effects, including correlations between assemblies.

    We first elucidated the dynamical modes in the spiking network and how those can be related to the rate network. Using an approach from (1, 2), we considered the mean-field theory for the spiking network, reducing the degrees of freedom from N neurons to 2p+2 E/I assemblies (plus E/I background populations), then we identified the approximate dynamical modes as E/I cluster pairs emerging as the Schur eigenvectors of the mean field-reduced coupling matrix. Comparing the eigenvalue distribution of the full vs. the mean field-reduced coupling matrix, we found that the slow timescales capturing the assemblies metastable dynamics correspond to the p−1 large positive eigenvalues corresponding to the Schur modes. The heterogeneity in timescales of the spiking model arises from the heterogeneous distribution of these gapped eigenvalues, reflecting the hierarchy in assembly sizes and assembly self-couplings in the mean field approach. We then analyzed the eigenvalues in the rate network with a lognormal self-coupling distribution and found a similar picture, where the slow units are related to the large eigenvalues in the coupling matrix (Appendix 2). We also note that in the rate network, there is no gap in the eigenvaluedistributionas there are many units with small values of the self-couplings. On the other hand in the spiking network the large eigenvalues are p − 1, where p is the number of assemblies, and they are gapped. These new analyses clarify the correspondence between rate network units and spiking network E/I cluster pairs, arising from the Schur picture.

    We now discuss previous studies to examine whether bistability in the spiking network arises from assembly self-couplings or from other effects. Previous mean-field analyses of spiking networks with clustered connectivity showed that the bistability of assembly dynamics is due to the presence of a large self-coupling, rather than from the interactions with other assemblies. We briefly summarize the published evidence for this. The seminal work of (3) showed that in a network with assemblies, a bifurcation in network dynamics emerges when the assembly self-coupling JEE+ > Jc exceeds a critical value Jc; beyond this value, a low and a high activity stable state coexist. Although in this network these two states are stable, more recent work from (4, 5) showed that finite size effects (small assembly size) can destabilize the states, leading to the metastable regime. When the inhibitory population is homogeneous, as in these last two articles, metastability arises from finite size effects and it is sensitive to network parameters (5) and (6). Specifically, when one scales both the network size and the E assembly size, metastability disappears (5). Moreover, when the I population is homogeneous, then E clusters are anti-correlated, as correctly suggested by the Reviewer. However, our model differs from the ones just discussed in that the inhibitory population is also arranged in assemblies, which are reciprocally paired with E assemblies. In this class of E/I clustered models, metastability is robust to changes in network parameters (see (6)). More specifically, in our revised version, we show that metastable dynamics persists when scaling up the network size to N = 10,000 neurons (and scaling up network size with N). A crucial difference between the model with homogeneous I population vs the model with I assemblies (i.e., our model), is that in the former the assemblies are anti-correlated, while in the latter case the assemblies are uncorrelated (see Fig. 1), the same as in the rate network. These results suggest that transitions between metastable states in the spiking network may be driven by a coexistence of two effects: on the one hand, finite size effects due to the small assembly size, and on the other hand, by the heterogeneity in the inter-assembly couplings. Although the former effect is absent in the rate network, the latter is the driver of the chaotic activity observed in the rate network. Thus it is plausible that rate-based chaotic dynamics might also contribute to the metastable activity in the spiking network, although more targeted work should be performed to answer this question. In the revised version of the manuscript, we overhauled the subsection ’A reservoir of timescales in E-I spiking networks’, Fig. 5, and Appendix 2, by adding an extensive explanation of the emergence of slow timescales from the large eigenvalues in the Schur basis, and its comparison between spiking and rate network. In particular, we highlighted the differences between rate and spiking networks and the fact that finite size effects might be at play in the latter case.

    Furthermore, the prediction of the rate model is a quadratic increase of log(T), however, the data shown in Fig.5b do not seem to strongly support this prediction. More details and evidence that the data "was best fit with a quadratic polynomial" would be necessary to test the theoretical prediction.

    We increased the clarity and strengthened the support for the data in Fig 5b as "best fit with a quadratic polynomial" by addinga plot, inset in Fig 5b, alongsidea detailed explanation of the fitting procedure in Methods section (e). Figure 5b inset displays a cross-validatedmodel selection’s training and test errors for polynomial fit. The test error shows a minimal error at a polynomial degree 2, supporting the claim that the best fit was achieved with a quadratic polynomial. In Methods section (e), under "Model selection for timescale fit," we added a detailed description of the cross-validation procedure by which the fit was obtained. A quote from that section in the revised manuscript can also be found in this document under answer 11.

    Question 2. The time scale of a bistable probe unit driven by network-generated "noise" is taken to be the mean dwell time T (mean escape time) in a metastable state. It seems that the expressions Eq.4 and Eq.21 for this time are incorrect. The mean dwell time is given by the mean first-passage time (MFPT) from one potential minimum to the opposite one includingthe full passage across the barrier. At least, the final point for the MFPT should be significantly beyond the barrier to complete the escape. However, the authors only compute the MFPT to a location −xc slightly before the barrier is reached, at which point the probe unit has not managed to escape yet (e.g. it could go back to −x2 after reaching −xc instead of further going to +x2). It is not clear whether the UCNA can be applied to such escape problems because it is valid only in regions, where the potential is convex, and thus the UCNA may break down near the potential barrier. Indeed, the effective potential is not defined near the barrier (see forbidden zone in Fig.4b), and hence it is not clear how to calculate the mean escape time. Nonetheless, the incomplete MFPT computedby the authors seems to qualitatively predict the dependence on the self-coupling parameter s, at least in the example of Fig.4c. However, if the incomplete MFPT is taken as a basis, then the incomplete MFPT should also be used for the white-noise case for a fair comparison. It seems that the corresponding white-noise case is given by Eq.4 with τ1 = 0, which still has the same dependence on the self-coupling s2, contrary to what is claimed in the paper (it is unclear how the curve for the white-noise case in Fig.4 was obtained). Note that the UCNA has been designed such that it is valid for both small and large τ1 (thus, it is also unclear why the assumption of large τ1 is needed).

    Answer 2. We are deeply grateful to the Reviewer for this critical evaluation of our UCNA calculation of the escape times. We will first clarify our rationale and then discuss comparison with the white noise case. The idea behind our calculation is indeed that when starting from the left minimum −x2, the probe effectively escapes to +x2 before reaching the limit of the UCNA support region at −xc. First, our simulations show (Fig 4b light blue colored area) that the probe almost exclusively visits the valid areas |x| > xc: our new analysis shows that the fraction of activity spent in the forbidden region is (1.8+/ −0.4)×10−3 (mean±SD over 10 probe units run with parameters as in Fig. 4a-b), confirming the fact that the histogram of x values from simulations has almost null support in the forbidden region |x| < xc. This is also supported by the representative simulation time course in Fig. 4a which exhibits abrupt jumps between the two bistable states. We then estimated the ’escape point’ from simulations as follows: for a transition from the x = −s2 well towards the x = +x2 well, the escape point is defined as the point where x on the side of the source well, i.e. x < 0, but the trajectory starts accelerating towards the target well (positive second derivative). We found that the distribution of escape points was predominantly in the allowed region (93.8%). This analysis supports our method to calculate the MFPT and confirms that our calculation is performed in the valid UCNA region. In the revised version of the manuscript, we added a clarification of this point with text and a new supplementary figure in Fig. 4 Suppl. 1. Regarding the comparison with white noise, we compared white-noise-driven probe dynamics with a probe driven by a network (effectively represented by the colored noise). To adequately make this comparison, we replaced the input coming from the network into the probe unit (Eq 1. rhs last term) with white noise. The rest of the terms in this equation were left untouched to maintain the exact probe’s self-response properties. This procedure aims to understand the unique contribution of the colored noise generated by the network to each unit dynamics by removing its "colored" correlated input contribution but otherwise leaving all dynamical properties the same. For clarity of the manuscript on this subject, we added a paragraph about it under "A comparison with white noise" in Methods section (d).

    We can estimate the mean first passage time (MFPT) of a probe unit driven by white noise with Eq. 4. The procedure described above for switching the network drive with white noise also dictates the parameter values to use in Eq. 4 for the case of white noise. First, with no correlation in white noise τ1 = 0. Second, D, the magnitude of the drive inherits its value from the network (see also Eq. 22) as the strength of the white noise (its integral around zero as a δ function). The results are presented in Fig 4. To strengthen the results and improve the clarity of the text, we expanded the content of Fig 4c. The plot now includes both the results of simulations (Fig. 4c green line) and estimation by mean first passage time (Fig. 4c green dashed line) for white noise, as explained above. We note that the potential in the white noise case (Fig. 4b green dashed line) does include a concave part. Indeed,the agreementbetweenthe distributionretrieved from simulations (Fig. 4b light green area) and its locations’ visit probability approximated by theory (Eq. 19 with τ1 = 0, Fig. 4b green line) are not in full agreement. However, this probability is still a good approximation. As a result, the mean first passage time (Eq. 4, Fig. 4c green dashed line) is a good approximation. The great advantage of having Eq. 4 as an approximation for the mean first passage time is that it clearly explainsthe contributionof each part of the dynamical equation (Eq. 1) towards achieving long timescales. Mainly, since log depends on τ1 linearly, its exponent, the mean first passage time depends on tau1 exponentially. Hence the importance of the color in the input and the vast differences between the network drive and the white noise.

    Question 3. The given argument that the time-scale separation arises as network effect is not very clear. Apart from the issue of a fair comparison of colored and white noise raised in point 1 above, an external colored noise with matched statistics that drives a single bistable unit would yield the same MFPT and thus would be an alternative explanation independent of the network dynamics.

    Answer 3. The goal of our investigation was to uncover a neural mechanism that induces heterogeneous timescales in a self-consistent way. The idea of self-consistencyis the central tenet of our approach, namely, that a timescale distribution must arise due to the internal dynamics of a recurrent circuit without the need to invoke an external auxiliary force driving it. If we had an external colored noise with matched statistics driving the probe unit, then we would still have to explain which mechanism would give rise to that particular statistics of the colored noise - with the most natural explanation being a recurrent network with time-varying activity.

    The second ingredient in our argument demonstrating that it is a network effect is the following. If the time-scale separation was not a network effect, but rather a property of a single probe unit, then it would persist regardless of the specific features of the noise driving the unit. To test this hypothesis, we compared the scenarios of the same probe unit driven by the self-consistent noise generated by the rest of the network, as opposed to white noise, and found that the time-scale separation is not present in the second case. Thus, the time-scale separation is not an intrinsic property of the probe unit, but, rather, it relies on the unit being part of a recurrent network generating a specific kind of noise. This argument is explained in the last paragraph of the section ’Separation of timescales in the bistable chaotic regime’.

    Question 4. The UCNA has assumptions and regimes of validity that are not stated in the paper. In particular, it assumes an Ornstein-Uhlenbeck noise, which has an exponential auto-correlation function, and local stability (region where potential is convex). Because the self-consistent auto-correlation function is generally not exponential and the probe unit also visits regions where the potential is concave, the validity of the UCNA is not clear. On the other hand, the assumption of large correlation time might be dropped as the UCNA’s main feature is that it works for both large and small correlation times.

    Answer 4. We thanks the Reviewer again for this critical evaluation of our assumptions, however, we believe that our approach is justified because of the following two arguments. First, although the UCNA was derived in case of an OU process, it has since then been successfully applied to different classes of noise, including multiplicative noise, harmonic noise, and others (see e.g. (7–9). To the best of our knowledge, the UCNA has never been applied before to noise whose autocorrelation arises from chaotic dynamics, whose hallmark is a vanishing slope at zero lag, markedly different from the OU process. To address the concern about concavity, we performed the additional analyses discussed in our answer to Question 2, showing that the probe unit never visits regions where the potential is concave, which would lie outside of the support of the potential. Because of these two considerations, we believe that the UCNA is valid in our scenario, as suggested by the good agreement between theory and simulation at large values of the self-couplingsin Fig. 4c. Finally, we thank the Reviewer for bringing up the fact that UCNA works for both large and small correlation times, we fixed that in the revised manuscript.

  2. eLife assessment

    This fundamental work uses computational network models to suggest a possible origin of the wide range of time scales observed in cortical activity. This claim is supported by convincing evidence based on comparisons between mathematical theory, simulations of spiking network models, and analysis of recordings from the orbitofrontal cortex. This manuscript will be of interest to the broad community of systems and computational neuroscience.

  3. Reviewer #1 (Public Review):

    The authors aim to theoretically explain the wide range of time scales observed in cortical circuits in the brain -- a fundamental problem in theoretical neuroscience. They propose that the variety of time scales arises in recurrent neural networks with heterogeneous units that represent neuronal assemblies of different sizes that transition through sequences of high- and low-activity metastable states. When transitions are driven by intrinsically generated noise, the heterogeneity leads to a wide range of escape times (and hence time scales) across units. As a mathematically tractable model, they consider a recurrent network of heterogeneous bistable rate units in the chaotic regime. The model is an extension of the previous model by Stern et al (Phys. Rev. E, 2014) to the case of heterogeneous self-coupling parameters. Biologically, this heterogeneous parameter is interpreted as different assembly sizes. The chaoticity acts as intrinsically generated noise-driving transitions between bistable states with escape times that are indeed widely distributed because of the heterogeneity. The distribution is successfully fitted to experimental data. Using previous dynamic mean-field theory, the self-consistent auto-correlation function of the driving noise in the mean-field model is computed (I guess numerically). This leaves the theoretical problem of calculating escape times in the presence of colored noise, which is solved using the unified colored-noise approximation (UCNA). They find that the log of the correlation time of a given unit increases quadratically with the self-coupling strength of that unit, which nicely explains the distribution of time scales over several orders of magnitude. As a biologically plausible implementation of the theory, they consider a spiking neural network with clustered connectivity and heterogeneous cluster sizes (extension of the previous model by Mazzucato et al. J Neurosci 2015). Simulations of this model also exhibit a quadratic increase in the log dwell time with cluster size. Finally, the authors demonstrate that heterogeneous assemblies might be useful to differentially transmit different frequency components of a broadband stimulus through different assemblies because the assembly size modulates the gain.

    I found the paper conceptually interesting and original, especially the analytical part on estimating the mean escape times in the rate network using the idea of probe units and the UCNA. It is a nice demonstration of how chaotic activity serves as noise-driving metastable activity. Calculating the typical time scales of such metastable activity is a hard theoretical problem, for which the authors made considerable advancement. The conclusions of this paper are mostly well supported by simulations and mathematical analysis, but some aspects need to be clarified and extended, especially concerning the biological plausibility of the rate network model and its relation to the spiking neural network model as well as the analytical calculation of the mean dwell time.

    1. The theory is based on a somewhat unbiological network of bistable rate units. It seems to only loosely apply to the implementation with a spiking neural network with clustered architecture, which is used as a biological justification of the rate model. In the spiking model, a wide distribution of time scales also emerges as a consequence of noise-induced escapes in combination with heterogeneity. Apart from this analogy, however, the mechanisms for metastability seem to be quite different: firstly, the functional units in the spiking neural network are presumably not bistable themselves but multistability only emerges as a network effect, i.e. from the interaction with other assemblies and inhibitory neurons. (This difference yields anti-correlations between assemblies in the spiking model, in marked contrast to the independence of bistable rate units (if N is large).) Secondly, transitions between metastable states are presumably not driven by chaotic dynamics but by finite-size fluctuations (e.g. Litwin-Kumar & Doiron 2012). The latter is also strongly dependent on assembly size. More precisely, the mechanism of how assembly size shapes escape times T seems to be different: in the rate model the self-coupling ("assembly size") predominantly affects the effective potential, whereas in the spiking network, the assembly size predominantly affects the noise.

    Furthermore, the prediction of the rate model is a quadratic increase of log(T), however, the data shown in Fig.5b do not seem to strongly support this prediction. More details and evidence that the data "was best fit with a quadratic polynomial" would be necessary to test the theoretical prediction. Therefore, the correspondence between the rate model and the spiking model should probably be regarded in a looser sense than presented in the paper.

    1. The time scale of a bistable probe unit driven by network-generated "noise" is taken to be the mean dwell time T (mean escape time) in a metastable state. It seems that the expressions Eq.4 and Eq.21 for this time are incorrect. The mean dwell time is given by the mean first-passage time (MFPT) from one potential minumum to the opposite one including the full passage across the barrier. At least, the final point for the MFPT should be significantly beyond the barrier to complete the escape. However, the authors only compute the MFPT to a location -x_c slightly before the barrier is reached, at which point the probe unit has not managed to escape yet (e.g. it could go back to -x_2 after reaching -x_c instead of further going to +x_2). It is not clear whether the UCNA can be applied to such escape problems because it is valid only in regions, where the potential is convex, and thus the UCNA may break down near the potential barrier. Indeed, the effective potential is not defined near the barrier (see forbidden zone in Fig.4b), and hence it is not clear how to calculate the mean escape time. Nonetheless, the incomplete MFPT computed by the authors seems to qualitatively predict the dependence on the self-coupling parameter s, at least in the example of Fig.4c. However, if the incomplete MFPT is taken as a basis, then the incomplete MFPT should also be used for the white-noise case for a fair comparison. It seems that the corresponding white-noise case is given by Eq.4 with tau_1=0, which still has the same dependence on the self-coupling s_2, contrary to what is claimed in the paper (it is unclear how the curve for the white-noise case in Fig.4 was obtained). Note that the UCNA has been designed such that it is valid for both small and large tau_1 (thus, it is also unclear why the assumption of large tau_1 is needed).

    2. The given argument that the time-scale separation arises as network effect is not very clear. Apart from the issue of a fair comparison of colored and white noise raised in point 1 above, an external colored noise with matched statistics that drives a single bistable unit would yield the same MFPT and thus would be an alternative explanation independent of the network dynamics.

    3. The UCNA has assumptions and regimes of validity that are not stated in the paper. In particular, it assumes an Ornstein-Uhlenbeck noise, which has an exponential auto-correlation function, and local stability (region where potential is convex). Because the self-consistent auto-correlation function is generally not exponential and the probe unit also visits regions where the potential is concave, the validity of the UCNA is not clear. On the other hand, the assumption of large correlation time might be dropped as the UCNA's main feature is that it works for both large and small correlation times.

  4. Reviewer #2 (Public Review):

    It is well known that introducing clusters in balanced random networks leads to metastable dynamics that potentially span long time scales. The authors build on their previous work (Stern et al. 2014) and here show that the lifetime of metastable states depends on the size of the individual activated clusters. Showing qualitative similarities between clustered spiking networks and networks of bistable rate units, the authors further derive dynamic mean-field predictions for the separation of time scales of the dynamics in relation to differences in the strength of self-couplings in rate networks. Further, they confirm these results in simulations of spiking networks and compare them to time scales observed in the orbitofrontal cortex. Finally, the authors show that assemblies of a particular size (and thus time scale) get entrained by specific external input frequencies, allowing the network to demix temporal signals in a spatial manner.

    The manuscript is in general well written and addresses a timely and important topic in neuroscience. However, there are concerns related to the discussion of alternative mechanisms for a large repertoire of time scales as well as the relation between the spiking and rate network model.