## Article activity feed

1. Author Response

Reviewer #1 (Public Review):

Jones et al. investigated the relationship between scale free neural dynamics and scale free behavioral dynamics in mice. An extensive prior literature has documented scale free events in both cortical activity and animal behavior, but the possibility of a direct correspondence between the two has not been established. To test this link, the authors took advantage of previously published recordings of calcium events in thousands of neurons in mouse visual cortex and simultaneous behavioral data. They find that scale free-ness in spontaneous behavior co occurs with scale free neuronal dynamics. The authors show that scale free neural activity emerges from subsets of the larger population - the larger population contains anticorrelated subsets that cancel out one another's contribution to population-level events. The authors propose an updated model of the critical brain hypothesis that accounts for the obscuring impact of large populations on nested subsets that generate scale free activity. The possibility that scale free activity, and specifically criticality, may serve as a unifying theory of brain organization has suffered from a lack of high-resolution connection between observations of neuronal statistics and brain function. By bridging theory, neural data, and behavioral dynamics, these data add a valuable contribution to fields interested in cortical dynamics and spontaneous behavior, and specifically to the intersection of statistical physics and neuroscience.

Strengths:

This paper is notably well written and thorough.

The authors have taken a cutting-edge, high-density dataset and propose a data-driven revision to the status-quo theory of criticality. More specifically, due to the observed anticorrelated dynamics of large populations of neurons (which doesn't fit with traditional theories of criticality), the authors present a clever new model that reveals critical dynamics nested within the summary population behavior.

The conclusions are supported by the data.

Avalanching in subsets of neurons makes a lot of sense - this observation supports the idea that multiple, independent, ongoing processes coexist in intertwined subsets of larger networks. Even if this is wrong, it's supported well by the current data and offers a plausible framework on which scale free dynamics might emerge when considered at the levels of millions or billions of neurons.

The authors present a new algorithm for power law fitting that circumvents issues in the KS test that is the basis of most work in the field.

Weaknesses:

This paper is technically sound and does not have major flaws, in my opinion. However, I would like to see a detailed and thoughtful reflection on the role that 3 Hz Ca imaging might play in the conclusions that the authors derive. While the dataset in question offers many neurons, this approach is, from other perspectives, impoverished - calcium intrinsically misses spikes, a 3 Hz sampling rate is two orders of magnitude slower than an action potential, and the recordings are relatively short for amassing substantial observations of low probability (large) avalanches. The authors carefully point out that other studies fail to account for some of the novel observations that are central to their conclusions. My speculative concern is that some of this disconnect may reflect optophysiological constraints. One argument against this is that a truly scale free system should be observable at any temporal or spatial scale and still give rise to the same sets of power laws. This quickly falls apart when applied to biological systems which are neither infinite in time nor space. As a result, the severe mismatch between the spatial resolution (single cell) and the temporal resolution (3 Hz) of the dataset, combined with filtering intrinsic to calcium imaging, raises the possibility that the conclusions are influenced by the methods. Ultimately, I'm pointing to an observer effect, and I do not think this disqualifies or undermines the novelty or potential value of this work. I would simply encourage the authors to consider this carefully in the discussion.

R1a: We quite agree with the reviewer that reconciling different scales of measurement is an important and interesting question. One clue comes from Stringer et al’s original paper (2019 Science). They analyzed time-resolved spike data (from Neuropixel recordings) alongside the Ca imaging data we analyzed here. They showed that if the ephys spike data was analyzed with coarse time resolution (300 ms time bins, analogous to the Ca imaging data), then the anticorrelated activity became apparent (50/50 positive/negative loadings of PC1). When analyzed at faster time scales, anticorrelations were not apparent (mostly positive loadings of PC1). This interesting point was shown in their Supplementary Fig 12.

This finding suggests that our findings about anticorrelated neural groups may be relevant only at coarse time scales. Moreover, this point suggests that avalanche statistics may differ when analyzed at very different time scales, because the cancelation of anticorrelated groups may not be an important factor at faster timescales.

In our revised manuscript, we explored this point further by analyzing spike data from Stringer et al 2019. We focused on the spikes recorded from one local population (one Neuropixel probe). We first took the spike times of ~300 neurons and convolved them with a fast rise/slow fall, like typical Ca transient. Then we downsampled to 3 Hz sample rate. Next, we deconvolved using the same methods as those used by Stringer et al (OASIS nonnegative deconvolution). And finally, we z-scored the resulting activity, as we did with the Ca imaging data. With this Ca-like signal in hand, we analyzed avalanches in four ways and compared the results. The four ways were: 1) the original time-resolved spikes (5 ms resolution), 2) the original spikes binned at 330 ms time res, 3) the full population of slow Ca-like signal, and 4) a correlated subset of neurons from the slow Ca-like signal. Based on the results of this new analysis (now in Figs S3 and S4), we found several interesting points that help reconcile potential differences between fast ephys and slow Ca signals:

1. In agreement with Sup Fig 12 from Stringer et al, anticorrelations are minimal in the fast, time-resolved spike data, but can be dominant in the slow, Ca-like signal.

2. Avalanche size distributions of spikes at fast timescales can exhibit a nice power law, consistent with previous results with exponents near -2 (e.g. Ma et al Neuron 2019, Fontenele et al PRL 2019). But, the same data at slow time scales exhibited poor power-laws when the entire population was considered together.

3. The slow time scale data could exhibit a better power law if subsets of neurons were considered, just like our main findings based on Ca imaging. This point was the same using coarse time-binned spike data and the slow Ca-like signals, which gives us some confidence that deconvolution does not miss too many spikes.

In our opinion, a more thorough understanding of how scale-free dynamics differs across timescales will require a whole other paper, but we think these new results in our Figs S3 and S4 provide some reassurance that our results can be reconciled with previous work on scale free neural activity at faster timescales.

Reviewer #2 (Public Review):

The overall goal of the paper is to link spontaneous neural activity and certain aspects of spontaneous behavior using a publicly available dataset in which 10,000 neurons in mouse visual cortex were imaged at 3 Hz with single-cell resolution. Through careful analysis of the degree to which bouts of behavior and bouts of neural activity are described (or not) by power-law distributions, the authors largely achieve these goals. More specifically, the key findings are that (a) the size of bouts of whisking, running, eye movements, and pupil dilation are often well-fit by a power-law distribution over several decades, (b) subsets of neurons that are highly correlated with one of these behavioral metrics will also exhibit power-law distributed event sizes, (c) neuron clusters that are uncorrelated with behavior tend to not be scale-free, (d) crackling relationships are generally not found (i.e. size with duration exponent (if there is scaling) was not predicted by size power-law and duration power-law), (e) bouts of behavior could be linked to bouts of neural activity. In the second portion of the paper, the authors develop a computational model with sets of correlated and anti-correlated neurons, which can be accomplished under a relatively small subset of connection architectures: out of the hundreds of thousands of networks simulated, only 31 generated scale-free subsets/non-scale-free population/anti correlated e-cells/anti-correlated i-cells in agreement with the experimental recordings.

The data analysis is careful and rigorous, especially in the attention to fitting power laws, determining how many decades of scaling are observed, and acknowledging when a power-law fit is not justified. In my view, there are two weaknesses of the paper, related to how the results connect to past work and to the set-up and conclusions drawn from the computational modeling, and I discuss those in detail below. While my comments are extensive, this is due to high interest. I do think that the authors make an important connection between scale-free distributions of neural activity and behavior, and that their use of computational modeling generates some interesting mechanistic hypotheses to explore in future work.

My first general reservation is in the relationship to past work and the overall novelty. The authors state in the introduction, "according to the prevailing view, scale-free ongoing neural activity is interpreted as 'background' activity, not directly linked to behavior." It would be helpful to have some specific references here, as several recent papers (including the Stringer et al. 2019 paper from which these data were taken, but also papers from McCormick lab and (Anne) Churchland lab) showed a correlation between spontaneous activity and spontaneous facial behaviors. To my knowledge, the sorts of fidgety behavior analyzed in this paper have not been shown to be scale-free, and so (a) is a new result, but once we know this, it seems that (e) follows because we fully expect some neurons to correlate with some behavior.

R2a: We agree with the reviewer that our original introductory, motivating arguments needed improvement. We have now rewritten the last 2 paragraphs of the introduction. We hope we have now laid out our argument more clearly, with more appropriate supporting citations. In brief, the logic is this:

1. Previous theory, modeling, and experiments on the topic of scale-free neural activity suggest that this phenomenon is an autonomous, internally generated thing, independent of anything the body is doing.

2. Relatively new experiments (including those by Churchland’s lab and McCormmick’s lab: Stringer 2019; Salkoff 2020; Clancy 2019; Musall 2019) suggest a different picture with a link between spontaneous behaviors and ongoing cortical activity, but these studies did not address any questions about scale-free-ness.

3. Moreover, these new experiments show that behavioral variables only manage to explain about 10-30% of ongoing activity.

4. Is this behaviorally-explainable 10-30% scale-free or perhaps the scale-free aspects of cortical dynamics fall withing the other 70-90%. Our goal is to find out.

Digging a bit more on this issue, I would argue that results (b) and (c) also follow. By selecting subsets of neurons with very high cross-correlation, an effective latent variable has emerged. For example, the activity rasters of these subsets are similar to a population in which each neuron fires with the same time-varying rate (i.e., a heterogeneous Poisson process). Such models have been previously shown to be able to generate power-law distributed event sizes (see, eg., Touboul and Destexhe, 2017; also work by Priesemann). With this in mind, if you select from the entire population a set of neurons whose activity is effectively determined by a latent variable, do you not expect power laws in size distributions?

Our understanding is that not all Poisson processes with a time-varying rate will result in a power law. It is quite essential that the fluctuations in rate must themselves be power-law distributed. As a clear example of how this breaks down, consider a Poisson rate that varies according to a sine wave with fixed period and amplitude. In this case, the avalanche size distribution is definitely not scale-free, it would have a clear typical scale. Another point of view on this comes from some of the simplest models used to study criticality – e.g. all-to-all connected probabilistic binary neurons (like in Shew et al 2009 J Neurosi). These models do generate spiking with a time-varying Poisson rate when they are at criticality or away from criticality. But, only when the synaptic strength is tuned to criticality is the time-varying rate going to generate power-law distributed avalanches. I think the Priesmann & Shriki paper made this point as well.

My second reservation has to do with the generality of the conclusions drawn from the mechanistic model. One of the connectivity motifs identified appears to be i+ to e- and i- to e+, where potentially i+/i- are SOM and VIP (or really any specific inhibitory type) cells. The specific connections to subsets of excitatory cells appear to be important (based on the solid lines in Figure 8). This seems surprising: is there any experimental support for excitatory cells to preferentially receive inhibition from either SOM or VIP, but not both?

R2b: There is indeed direct experimental support for the competitive relationship between SOM, VIP, and functionally distinct groups of excitatory neurons. This was shown in the paper by Josh Trachtenberg’s group: Garcia-Junco-Clemente et al 2017. An inhibitory pull-push circuit in frontal cortex. Nat Neurosci 20:389–392. However, we emphasize that we also showed (lower left motif in Fig 8G) that a simpler model with only one inhibitory group is sufficient to explain the anticorrelations and scale-free dynamics we observe. We opted to highlight the model with two inhibitory groups since it can also account for the Garcia-Junco-Clemente et al results.

In the section where we describe the model, we state, “We considered two inhibitory groups, instead of just one, to account for previous reports of anticorrelations between VIP and SOM inhibitory neurons in addition to anticorrelations between groups of excitatory neurons (Garcia-Junco-Clemente et al., 2017).”

More broadly, I wonder if the neat diagrams drawn here are misleading. The sample raster, showing what appears to be the full simulation, certainly captures the correlated/anti-correlated pattern of the 100 cells most correlated with a seed cell and 100 cells most anti-correlated with it, but it does not contain the 11,000 cells in between with zero to moderate levels of correlation.

R2c: We agree that our original model has several limitations and that one of the most obvious features lacking in our model is asynchronous neurons (The limitations are now discussed more openly in the last paragraph of the model subsection). In the data from the Garcia-Junco-Clemente et al paper above there are many asynchronous neurons as well. To ameliorate this limitation, we have now created a modified model that now accounts for asynchronous neurons together with the competing anticorrelated neurons (now shown and described in Fig S9). We put this modified model in supplementary material and kept the simpler, original model in the main findings of our work, because the original model provides a simpler account of the features of the data we focused on in our work – i.e. anticorrelated scale-free fluctuations. The addition of the asynchronous population does not substantially change the behavior of the two anticorrelated groups in the original model.

We probably expect that the full covariance matrix has similar structure from any seed (see Meshulam et al. 2019, PRL, for an analysis of scaling of coarse-grained activity covariance), and this suggests multiple cross-over inhibition constraints, which seem like they could be hard to satisfy.

R2d: We agree that it remains an outstanding challenge to create a model that reproduces the full complexity of the covariance matrix. We feel that this challenge is beyond the scope of this paper, which is already arguably squeezing quite a lot into one manuscript (one reviewer already suggested removing figures!).

We added a paragraph at the end of the subsection about the model to emphasize this limitation of the model as well as other limitations. This new paragraph says:

While our model offers a simple explanation of anticorrelated scale-free dynamics, its simplicity comes with limitations. Perhaps the most obvious limitation of our model is that it does not include neurons with weak correlations to both e+ and e- (those neurons in the middle of the correlation spectrum shown in Fig 7B). In Fig S9, we show that our model can be modified in a simple way to include asynchronous neurons. Another limitation is that we assumed that all non-zero synaptic connections were equal in weight. We loosen this assumption allowing for variable weights in Fig S9, without changing the basic features of anticorrelated scale-free fluctuations. Future work might improve our model further by accounting for neurons with intermediate correlations.

The motifs identified in Fig. 8 likely exist, but I am left with many questions of what we learned about connectivity rules that would account for the full distribution of correlations. Would starting with an Erdos-Renyi network with slight over-representation of these motifs be sufficient? How important is the homogeneous connection weights from each pool assumption - would allowing connection weights with some dispersion change the results?

R2e: First, we emphasize that our specific goal with our model was to identify a possible mechanism for the anticorrelated scale-free fluctuations that played the key role in our analyses. We agree that this is not a complete account of all correlations, but this was not the goal of our work. Nonetheless, our new modified model in Fig S9 now accounts for additional neurons with weak correlations. However, we think that future theoretical/modeling work will be required to better account for the intermediate correlations that are also present in the experimental data.

We confirmed that an Erdo-Renyi network of E and I neurons can produce scale-free dynamics, but cannot produce substantial anticorrelated dynamics (Fig 8G, top right motif). Additionally, the parameter space study we performed with our model in Fig 8 showed that if the interactions between the two excitatory groups exceed a certain tipping point density, then the model behavior switches to behavior expected from an Erdos-Renyi network (Fig 8F). Finally, we have now confirmed that some non-uniformity of synaptic weights does not change the main results (Fig S9). In the model presented in Fig S9, the value of each non-zero connection weight was drawn from a uniform distribution [0,0.01] or [-0.01,0] for excitatory and inhibitory connections, respectively. All of these facts are described in the model subsection of the paper results.

As a whole, this paper has the potential to make an impact on how large-scale neural and behavioral recordings are analyzed and interpreted, which is of high interest to a large contingent of the field.

Reviewer #3 (Public Review):

The primary goal of this work is to link scale free dynamics, as measured by the distributions of event sizes and durations, of behavioral events and neuronal populations. The work uses recordings from Stringer et al. and focus on identifying scale-free models by fitting the log-log distribution of event sizes. Specifically, the authors take averages of correlated neural sub-populations and compute the scale-free characterization. Importantly, neither the full population average nor random uncorrelated subsets exhibited scaling free dynamics, only correlated subsets. The authors then work to relate the characterization of the neuronal activity to specific behavioral variables by testing the scale-free characteristics as a function of correlation with behavior. To explain their experimental observation, the authors turn to classic e-i network constructions as models of activity that could produce the observed data. The authors hypothesize that a winner-take-all e-i network can reproduce the activity profiles and therefore might be a viable candidate for further study. While well written, I find that there are a significant number of potential issues that should be clarified. Primarily I have main concerns: 1) The data processing seems to have the potential to distort features that may be important for this analysis (including missed detections and dynamic range), 2) The analysis jumps right to e-i network interactions, while there seems to be a much simpler, and more general explanation that seems like it could describe their observations (which has to do with the way they are averaging neurons), and 3) that the relationship between the neural and behavioral data could be further clarified by accounting for the lop-sidedness of the data statistics. I have included more details below about my concerns below.

Main points:

1. Limits of calcium imaging: There is a large uncertainty that is not accounted for in dealing with smaller events. In particular there are a number of studies now, both using paired electro-physiology and imaging [R1] and biophysical simulations [R2] that show that for small neural events are often not visible in the calcium signal. Moreover, this problem may be exacerbated by the fact that the imaging is at 3Hz, much lower than the more typical 10-30Hz imaging speeds. The effects of this missing data should be accounted for as could be a potential source of large errors in estimating the neural activity distributions.

R3a: We appreciate the concern here and agree that event size statistics could in principle be biased in some systematic way due to missed spikes due to deconvolution of Ca signals. To directly test this possibility, we performed a new analysis of spike data recorded with high time resolution electrophysiology. We began with forward-modeling process to create a low-time-resolution, Ca-like signal, using the same deconvolution algorithm (OASIS) that was used to generate the data we analyzed in our work here. In agreement with the reviewer’s concern, we found that spikes were sometimes missed, but the loss was not extreme and did not impact the neural event size statistics in a significant way compared to the ground truth we obtained directly from the original spike data (with no loss of spikes). This new work is now described in a new paragraph at the end of the subsection of results related to Fig 3 and in a new Fig S3. The new paragraph says…

Two concerns with the data analyzed here are that it was sampled at a slow time scale (3 Hz frame rate) and that the deconvolution methods used to obtain the data here from the raw GCAMP6s Ca imaging signals are likely to miss some activity (Huang et al., 2021). Since our analysis of neural events hinges on summing up activity across neurons, could it be that the missed activity creates systematic biases in our observed event size statistics? To address this question, we analyzed some time-resolved spike data (Neuropixel recording from Stringer et al 2019). Starting from the spike data, we created a slow signal, similar to that we analyzed here by convolving with a Ca-transient, down sampling, deconvolving, and z-scoring (Fig S3). We compared neural event size distributions to “ground truth” based on the original spike data (with no loss of spikes) and found that the neural event size distributions were very similar, with the same exponent and same power-law range (Fig S3). Thus, we conclude that our reported neural event size distributions are reliable.

However, although loss of spikes did not impact the event size distributions much, the time-scale of measurement did matter. As discussed above and shown in Fig S4, changing from 5 ms time resolution to 330 ms time resolution does change the exponent and the range of the power law. However, in the test data set we worked with, the existence of a power law was robust across time scales.

1. Correlations and power-laws in subsets. I have a number of concerns with how neurons are selected and partitioned to achieve scale-free dynamics. 2a) First, it's unclear why the averaging is required in the first place. This operation projects the entire population down in an incredibly lossy way and removes much of the complexity of the population activity.

R3b: Our population averaging approach is motivated by theoretical predictions and previous work. According to established theoretical accounts of scale-free population events (i.e. non-equilibrium critical phenomena in neural systems) such population-summed event sizes should have power law statistics if the system is near a critical point. This approach has been used in many previous studies of scale-free neural activity (e.g. all of those cited in the introduction in relation to scale-free neuronal avalanches). One of the main results of our study is that the existing theories and models of critical dynamics in neural systems fail to account for small subsets of neurons with scale-free activity amid a larger population that does not conform to these statistics. We could not make this conclusion if we did not test the predictions of those existing theories and models.

2b) Second, the authors state that it is highly curious that subsets of the population exhibit power laws while the entire population does not. While the discussion and hypothesizing about different e-i interactions is interesting I believe that there's a discussion to be had on a much more basic level of whether there are topology independent explanations, such as basic distributions of correlations between neurons that can explain the subnetwork averaging. Specifically, if the correlation to any given neuron falls off, e.g., with an exponential falloff (i.e., a Gaussian Process type covariance between neurons), it seems that similar effects should hold. This type of effect can be easily tested by generating null distributions using code bases such as [R3]. I believe that this is an important point, since local (broadly defined) correlations of neurons implying the observed subnetwork behavior means that many mechanisms that have local correlations but don't cluster in any meaningful way could also be responsible for the local averaging effect.

R3c: We appreciate the reviewer’s effort, trying out some code to generate a statistical model. We agree that we could create such a statistical model that describes the observed distribution of pairwise correlations among neurons. For instance, it would be trivial to directly measure the covariance matrix, mean activities, and autocorrelations of the experimental data, which would, of course, provide a very good statistical description of the data. It would also be simple to generate more approximate statistical descriptions of the data, using multivariate gaussians, similar to the code suggested by the reviewer. However, we emphasize, this would not meet the goal of our modeling effort, which is mechanistic, not statistical. The aim of our model was to identify a possible biophysical mechanism from which emerge certain observed statistical features of the data. We feel that a statistical model is not a suitable strategy to meet this aim. Nonetheless, we agree with the reviewer that clusters with sharp boundaries (like the distinction between e+ an e- in our model) are not necessary to reproduce the cancelation of anticorrelated neurons. In other words, we agree that sharp boundaries of the e+ and e- groups of our model are not crucial ingredients to match our observations.

2c) In general, the discussion of "two networks" seems like it relies on the correlation plot of Figure~7B. The decay away from the peak correlation is sharp, but there does not seem to be significant clustering in the anti-correlation population, instead a very slow decay away from zero. The authors do not show evidence of clustering in the neurons, nor any biophysical reason why e and i neurons are present in the imaging data.

R3d: First a small reminder: As stated in the paper, the data here is only showing activity of excitatory neurons. Inhibitory neurons are certainly present in V1, but they are not recorded in this data set. Thus we interpret our e+ and e- groups as two subsets of anticorrelated excitatory neurons, like those we observed in the experimental data. We agree that our simplified model treats the anticorrelated subsets as if they are clustered, but this clustering is certainly not required for any of the data analyses of experimental data. We expect that our model could be improved to allow for a less sharp boundary between e+ and e- groups, but we leave that for future work, because it is not essential to most of the results in the paper. This limitation of the model is now stated clearly in the last paragraph of the model subsection.

The alternative explanation (as mentioned in (b)) is that the there is a more continuous set of correlations among the neurons with the same result. In fact I tested this myself using [R3] to generate some data with the desired statistics, and the distribution of events seems to also describe this same observation. Obviously, the full test would need to use the same event identification code, and so I believe that it is quite important that the authors consider the much more generic explanation for the sub-network averaging effect.

R3e: As discussed above, we respectfully disagree that a statistical model is an acceptable replacement for a mechanistic model, since we are seeking to understand possible biophysical mechanisms. A statistical model is agnostic about mechanisms. We have nothing against statistical models, but in this case, they would not serve our goals.

To emphasize our point about the inadequacy of a statistical model for our goals, consider the following argument. Imagine we directly computed the mean activities, covariance matrix, and autocorrelations of all 10000 neurons from the real data. Then, we would have in hand an excellent statistical model of the data. We could then create a surrogate data set by drawing random numbers from a multivariate gaussian with same statistical description (e.g. using code like that offered by reviewer 3). This would, by construction, result in the same numbers of correlated and anticorrelated surrogate neurons. But what would this tell us about the biophysical mechanisms that might underlie these observations? Nothing, in our opinion.

2d) Another important aspect here is how single neurons behave. I didn't catch if single neurons were stated to exhibit a power law. If they do, then that would help in that there are different limiting behaviors to the averaging that pass through the observed stated numbers. If not, then there is an additional oddity that one must average neurons at all to obtain a power law.

R3f: We understand that our approach may seem odd from the point of view of central-limit-theorem-type argument. However, as mentioned above (reply R3b) and in our paper, there is a well-established history of theory and corresponding experimental tests for power-law distributed population events in neural systems near criticality. The prediction from theory is that the population summed activity will have power-law distributed events or fluctuations. That is the prediction that motivates our approach. In these theories, it is certainly not necessary that individual neurons have power-law fluctuations on their own. In most previous theories, it is necessary to consider the collective activity of many neurons before the power-law statistics become apparent, because each individual neurons contributes only a small part to the emergent, collective fluctuations. This phenomenon does not require that each individual neuron have power-law fluctuations.

At the risk of being pedantic, we feel obliged to point out that one cannot understand the peculiar scale-free statistics that occur at criticality by considering the behavior of individual elements of the system; hence the notion that critical phenomena are “emergent”. This important fact is not trivial and is, for example, why there was a Nobel prize awarded in physics for developing theoretical understanding of critical phenomena.

1. There is something that seems off about the range of \beta values inferred with the ranges of \tau and $\alpha$. With \tau in [0.9,1.1], then the denominator 1-\tau is in [-0.1, 0.1], which the authors state means that \beta (found to be in [2,2.4]) is not near \beta_{crackling} = (\alpha-1)/(1-\tau). It seems as this is the opposite, as the possible values of the \beta_{crackling} is huge due to the denominator, and so \beta is in the range of possible \beta_{crackling} almost vacuously. Was this statement just poorly worded?

R3g: The point here is that theory of crackling noise predicts that the fit value of beta should be equal to (1-alpha)/(1-tau). In other words, a confirmation of the theory would have all the points on the unity line in the rightmost panels of Fig9D and 9E, not scattered by more than an order of magnitude around the unity line. (We now state this explicitly in the text where Fig 9 is discussed.) Broad scatter around the unity line means the theory prediction did not hold. This is well established in previous studies of scale-free brain dynamics and crackling noise theory (see for example Ma et al Neuron 2019, Shew et al Nature Physics 2015, Friedman et al PRL 2012). A clearer single example of the failure of the theory to predict beta is shown in Fig 5A,B, and C.

1. Connection between brain and behavior:

4a) It is not clear if there is more to what the authors are trying to say with the specifics of the scale free fits for behavior. From what I can see those results are used to motivate the neural studies, but aside from that the details of those ranges don't seem to come up again.

R3h: The reviewer is correct, the primary point in Fig 2 is that scale-free behavioral statistics often exist. Beyond this point about existence, reporting of the specific exponents and ranges is just standard practice for this kind of analysis; a natural question to ask after claiming that we find scale behavior is “what are the exponents and ranges”. We would be remiss not to report those numbers.

4b) Given that the primary connection between neuronal and behavioral activity seems to be Figure~4. The distribution of points in these plots seem to be very lopsided, in that some plots have large ranges of few-to-no data points. It would be very helpful to get a sense of the distribution of points which are a bit hard to see given the overlapping points and super-imposed lines.

R3i: We agree that this whitespace in the figure panels is a somewhat awkward, but we chose to keep the horizontal axis the same for all panels of Fig 4B, because this shows that not all behaviors, and not all animals had the same range of behavioral correlations. We felt that hiding this was a bit misleading, so we kept the white space.

4c) Neural activity correlated with some behavior variables can sometimes be the most active subset of neurons. This could potentially skew the maximum sizes of events and give behaviorally correlated subsets an unfair advantage in terms of the scale-free range.

2. Evaluation Summary:

This paper is of interest to neuroscientists studying the organization of neural activity and of behavior. The authors link the apparently scale-free distributions of behavioral metrics with scale-free distributions of neural activity, and then explore computationally mechanistic models that could account for these observations. While the alternative view set up in the introduction - that scale-free neural activity is "'background activity', not linked to behavior" - is perhaps overly simplistic, the analysis is thorough, and the mechanistic insights garnered from the computational modeling are intriguing.

(This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

3. Reviewer #1 (Public Review):

Jones et al. investigated the relationship between scale free neural dynamics and scale free behavioral dynamics in mice. An extensive prior literature has documented scale free events in both cortical activity and animal behavior, but the possibility of a direct correspondence between the two has not been established. To test this link, the authors took advantage of previously published recordings of calcium events in thousands of neurons in mouse visual cortex and simultaneous behavioral data. They find that scale free-ness in spontaneous behavior co occurs with scale free neuronal dynamics. The authors show that scale free neural activity emerges from subsets of the larger population - the larger population contains anticorrelated subsets that cancel out one another's contribution to population-level events. The authors propose an updated model of the critical brain hypothesis that accounts for the obscuring impact of large populations on nested subsets that generate scale free activity. The possibility that scale free activity, and specifically criticality, may serve as a unifying theory of brain organization has suffered from a lack of high-resolution connection between observations of neuronal statistics and brain function. By bridging theory, neural data, and behavioral dynamics, these data add a valuable contribution to fields interested in cortical dynamics and spontaneous behavior, and specifically to the intersection of statistical physics and neuroscience.

Strengths:
This paper is notably well written and thorough.

The authors have taken a cutting-edge, high-density dataset and propose a data-driven revision to the status-quo theory of criticality. More specifically, due to the observed anticorrelated dynamics of large populations of neurons (which doesn't fit with traditional theories of criticality), the authors present a clever new model that reveals critical dynamics nested within the summary population behavior.

The conclusions are supported by the data.

Avalanching in subsets of neurons makes a lot of sense - this observation supports the idea that multiple, independent, ongoing processes coexist in intertwined subsets of larger networks. Even if this is wrong, it's supported well by the current data and offers a plausible framework on which scale free dynamics might emerge when considered at the levels of millions or billions of neurons.

The authors present a new algorithm for power law fitting that circumvents issues in the KS test that is the basis of most work in the field.

Weaknesses:
This paper is technically sound and does not have major flaws, in my opinion. However, I would like to see a detailed and thoughtful reflection on the role that 3 Hz Ca imaging might play in the conclusions that the authors derive. While the dataset in question offers many neurons, this approach is, from other perspectives, impoverished - calcium intrinsically misses spikes, a 3 Hz sampling rate is two orders of magnitude slower than an action potential, and the recordings are relatively short for amassing substantial observations of low probability (large) avalanches. The authors carefully point out that other studies fail to account for some of the novel observations that are central to their conclusions. My speculative concern is that some of this disconnect may reflect optophysiological constraints. One argument against this is that a truly scale free system should be observable at any temporal or spatial scale and still give rise to the same sets of power laws. This quickly falls apart when applied to biological systems which are neither infinite in time nor space. As a result, the severe mismatch between the spatial resolution (single cell) and the temporal resolution (3 Hz) of the dataset, combined with filtering intrinsic to calcium imaging, raises the possibility that the conclusions are influenced by the methods. Ultimately, I'm pointing to an observer effect, and I do not think this disqualifies or undermines the novelty or potential value of this work. I would simply encourage the authors to consider this carefully in the discussion.

4. Reviewer #2 (Public Review):

The overall goal of the paper is to link spontaneous neural activity and certain aspects of spontaneous behavior using a publicly available dataset in which 10,000 neurons in mouse visual cortex were imaged at 3 Hz with single-cell resolution. Through careful analysis of the degree to which bouts of behavior and bouts of neural activity are described (or not) by power-law distributions, the authors largely achieve these goals. More specifically, the key findings are that (a) the size of bouts of whisking, running, eye movements, and pupil dilation are often well-fit by a power-law distribution over several decades, (b) subsets of neurons that are highly correlated with one of these behavioral metrics will also exhibit power-law distributed event sizes, (c) neuron clusters that are uncorrelated with behavior tend to not be scale-free, (d) crackling relationships are generally not found (i.e. size with duration exponent (if there is scaling) was not predicted by size power-law and duration power-law), (e) bouts of behavior could be linked to bouts of neural activity. In the second portion of the paper, the authors develop a computational model with sets of correlated and anti-correlated neurons, which can be accomplished under a relatively small subset of connection architectures: out of the hundreds of thousands of networks simulated, only 31 generated scale-free subsets/non-scale-free population/anti correlated e-cells/anti-correlated i-cells in agreement with the experimental recordings.

The data analysis is careful and rigorous, especially in the attention to fitting power laws, determining how many decades of scaling are observed, and acknowledging when a power-law fit is not justified. In my view, there are two weaknesses of the paper, related to how the results connect to past work and to the set-up and conclusions drawn from the computational modeling, and I discuss those in detail below. While my comments are extensive, this is due to high interest. I do think that the authors make an important connection between scale-free distributions of neural activity and behavior, and that their use of computational modeling generates some interesting mechanistic hypotheses to explore in future work.

My first general reservation is in the relationship to past work and the overall novelty. The authors state in the introduction, "according to the prevailing view, scale-free ongoing neural activity is interpreted as 'background' activity, not directly linked to behavior." It would be helpful to have some specific references here, as several recent papers (including the Stringer et al. 2019 paper from which these data were taken, but also papers from McCormick lab and (Anne) Churchland lab) showed a correlation between spontaneous activity and spontaneous facial behaviors. To my knowledge, the sorts of fidgety behavior analyzed in this paper have not been shown to be scale-free, and so (a) is a new result, but once we know this, it seems that (e) follows because we fully expect some neurons to correlate with some behavior.

Digging a bit more on this issue, I would argue that results (b) and (c) also follow. By selecting subsets of neurons with very high cross-correlation, an effective latent variable has emerged. For example, the activity rasters of these subsets are similar to a population in which each neuron fires with the same time-varying rate (i.e., a heterogeneous Poisson process). Such models have been previously shown to be able to generate power-law distributed event sizes (see, eg., Touboul and Destexhe, 2017; also work by Priesemann). With this in mind, if you select from the entire population a set of neurons whose activity is effectively determined by a latent variable, do you not expect power laws in size distributions?

My second reservation has to do with the generality of the conclusions drawn from the mechanistic model. One of the connectivity motifs identified appears to be i+ to e- and i- to e+, where potentially i+/i- are SOM and VIP (or really any specific inhibitory type) cells. The specific connections to subsets of excitatory cells appear to be important (based on the solid lines in Figure 8). This seems surprising: is there any experimental support for excitatory cells to preferentially receive inhibition from either SOM or VIP, but not both? More broadly, I wonder if the neat diagrams drawn here are misleading. The sample raster, showing what appears to be the full simulation, certainly captures the correlated/anti-correlated pattern of the 100 cells most correlated with a seed cell and 100 cells most anti-correlated with it, but it does not contain the 11,000 cells in between with zero to moderate levels of correlation. We probably expect that the full covariance matrix has similar structure from any seed (see Meshulam et al. 2019, PRL, for an analysis of scaling of coarse-grained activity covariance), and this suggests multiple cross-over inhibition constraints, which seem like they could be hard to satisfy. The motifs identified in Fig. 8 likely exist, but I am left with many questions of what we learned about connectivity rules that would account for the full distribution of correlations. Would starting with an Erdos-Renyi network with slight over-representation of these motifs be sufficient? How important is the homogeneous connection weights from each pool assumption - would allowing connection weights with some dispersion change the results?

As a whole, this paper has the potential to make an impact on how large-scale neural and behavioral recordings are analyzed and interpreted, which is of high interest to a large contingent of the field.

5. Reviewer #3 (Public Review):

The primary goal of this work is to link scale free dynamics, as measured by the distributions of event sizes and durations, of behavioral events and neuronal populations. The work uses recordings from Stringer et al. and focus on identifying scale-free models by fitting the log-log distribution of event sizes. Specifically, the authors take averages of correlated neural sub-populations and compute the scale-free characterization. Importantly, neither the full population average nor random uncorrelated subsets exhibited scaling free dynamics, only correlated subsets. The authors then work to relate the characterization of the neuronal activity to specific behavioral variables by testing the scale-free characteristics as a function of correlation with behavior. To explain their experimental observation, the authors turn to classic e-i network constructions as models of activity that could produce the observed data. The authors hypothesize that a winner-take-all e-i network can reproduce the activity profiles and therefore might be a viable candidate for further study. While well written, I find that there are a significant number of potential issues that should be clarified. Primarily I have main concerns: 1) The data processing seems to have the potential to distort features that may be important for this analysis (including missed detections and dynamic range), 2) The analysis jumps right to e-i network interactions, while there seems to be a much simpler, and more general explanation that seems like it could describe their observations (which has to do with the way they are averaging neurons), and 3) that the relationship between the neural and behavioral data could be further clarified by accounting for the lop-sidedness of the data statistics. I have included more details below about my concerns below.

Main points:
1)Limits of calcium imaging: There is a large uncertainty that is not accounted for in dealing with smaller events. In particular there are a number of studies now, both using paired electro-physiology and imaging [R1] and biophysical simulations [R2] that show that for small neural events are often not visible in the calcium signal. Moreover, this problem may be exacerbated by the fact that the imaging is at 3Hz, much lower than the more typical 10-30Hz imaging speeds. The effects of this missing data should be accounted for as could be a potential source of large errors in estimating the neural activity distributions.

1. Correlations and power-laws in subsets. I have a number of concerns with how neurons are selected and partitioned to achieve scale-free dynamics.
2a) First, it's unclear why the averaging is required in the first place. This operation projects the entire population down in an incredibly lossy way and removes much of the complexity of the population activity.
2b) Second, the authors state that it is highly curious that subsets of the population exhibit power laws while the entire population does not. While the discussion and hypothesizing about different e-i interactions is interesting I believe that there's a discussion to be had on a much more basic level of whether there are topology independent explanations, such as basic distributions of correlations between neurons that can explain the subnetwork averaging. Specifically, if the correlation to any given neuron falls off, e.g., with an exponential falloff (i.e., a Gaussian Process type covariance between neurons), it seems that similar effects should hold. This type of effect can be easily tested by generating null distributions using code bases such as [R3]. I believe that this is an important point, since local (broadly defined) correlations of neurons implying the observed subnetwork behavior means that many mechanisms that have local correlations but don't cluster in any meaningful way could also be responsible for the local averaging effect.
2c) In general, the discussion of "two networks" seems like it relies on the correlation plot of Figure~7B. The decay away from the peak correlation is sharp, but there does not seem to be significant clustering in the anti-correlation population, instead a very slow decay away from zero. The authors do not show evidence of clustering in the neurons, nor any biophysical reason why e and i neurons are present in the imaging data. The alternative explanation (as mentioned in (b)) is that the there is a more continuous set of correlations among the neurons with the same result. In fact I tested this myself using [R3] to generate some data with the desired statistics, and the distribution of events seems to also describe this same observation. Obviously, the full test would need to use the same event identification code, and so I believe that it is quite important that the authors consider the much more generic explanation for the sub-network averaging effect.
2d) Another important aspect here is how single neurons behave. I didn't catch if single neurons were stated to exhibit a power law. If they do, then that would help in that there are different limiting behaviors to the averaging that pass through the observed stated numbers. If not, then there is an additional oddity that one must average neurons at all to obtain a power law.

2. There is something that seems off about the range of \beta values inferred with the ranges of \tau and $\alpha$. With \tau in [0.9,1.1], then the denominator 1-\tau is in [-0.1, 0.1], which the authors state means that \beta (found to be in [2,2.4]) is not near \beta_{crackling} = (\alpha-1)/(1-\tau). It seems as this is the opposite, as the possible values of the \beta_{crackling} is huge due to the denominator, and so \beta is in the range of possible \beta_{crackling} almost vacuously. Was this statement just poorly worded?

3. Connection between brain and behavior:
4a) It is not clear if there is more to what the authors are trying to say with the specifics of the scale free fits for behavior. From what I can see those results are used to motivate the neural studies, but aside from that the details of those ranges don't seem to come up again.
4b) Given that the primary connection between neuronal and behavioral activity seems to be Figure~4. The distribution of points in these plots seem to be very lopsided, in that some plots have large ranges of few-to-no data points. It would be very helpful to get a sense of the distribution of points which are a bit hard to see given the overlapping points and super-imposed lines.
4c) Neural activity correlated with some behavior variables can sometimes be the most active subset of neurons. This could potentially skew the maximum sizes of events and give behaviorally correlated subsets an unfair advantage in terms of the scale-free range.