Granger causality analysis for calcium transients in neuronal networks, challenges and improvements

Abstract

One challenge in neuroscience is to understand how information flows between neurons in vivo to trigger specific behaviors. Granger causality (GC) has been proposed as a simple and effective measure for identifying dynamical interactions. At single-cell resolution, however, GC analysis is rarely used compared to directionless correlation analysis. Here, we study the applicability of GC analysis for calcium imaging data in diverse contexts. We first show that despite underlying linearity assumptions, GC analysis successfully retrieves non-linear interactions in a synthetic network simulating intracellular calcium fluctuations of spiking neurons. We highlight the potential pitfalls of applying GC analysis to real in vivo calcium signals, and offer solutions regarding the choice of GC analysis parameters. We took advantage of calcium imaging datasets from motoneurons in embryonic zebrafish to show how the improved GC can retrieve true underlying information flow. Applied to the network of brainstem neurons of larval zebrafish, our pipeline reveals strong driver neurons in the locus of the mesencephalic locomotor region (MLR), driving target neurons matching expectations from anatomical and physiological studies. Altogether, this practical toolbox can be applied to in vivo population calcium signals to increase the selectivity of GC to infer the flow of information across neurons.
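
For readers unfamiliar with the underlying computation, the following is a minimal sketch of bivariate GC between two calcium traces, assuming the signals have already been detrended. It is illustrative only; the full pipeline described in the paper adds smoothing, artifact handling, and an adaptive, shuffle-based significance threshold.

```python
import numpy as np
from scipy import stats

def bivariate_gc(x, y, lag=3):
    """F-statistic for 'x Granger-causes y' with maximum lag `lag` (illustrative)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    T = len(y)
    Y = y[lag:]
    # Lagged regressors: past of y alone (reduced model) vs. past of y and x (full model).
    past_y = np.column_stack([y[lag - k: T - k] for k in range(1, lag + 1)])
    past_x = np.column_stack([x[lag - k: T - k] for k in range(1, lag + 1)])
    X_red = np.column_stack([np.ones(T - lag), past_y])
    X_full = np.column_stack([X_red, past_x])

    rss_red = np.sum((Y - X_red @ np.linalg.lstsq(X_red, Y, rcond=None)[0]) ** 2)
    rss_full = np.sum((Y - X_full @ np.linalg.lstsq(X_full, Y, rcond=None)[0]) ** 2)

    n_obs, k_full = len(Y), X_full.shape[1]
    F = ((rss_red - rss_full) / lag) / (rss_full / (n_obs - k_full))
    p = stats.f.sf(F, lag, n_obs - k_full)
    gc = np.log(rss_red / rss_full)   # classical log variance-ratio GC value
    return gc, F, p
```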

Article activity feed

  1. eLife assessment

    This manuscript provides a solid and valuable analysis of the advantages and potential pitfalls of the application of Granger Causality to calcium imaging data that should be of interest to a wide range of neuroscience researchers. Granger Causality is a key tool in assessing the temporal relationships between variables, but one that is susceptible to many types of artifacts. The rigor of the authors' application of the methodology to calcium imaging data in particular leads to a more robust understanding of the effects of measurement artifacts and analysis choices on the results. There was some concern, however, about whether all of the findings would apply outside feedforward neural circuits and it was unclear how some of the results relate to others that currently exist in the literature.

  2. Reviewer #1 (Public Review):

    This manuscript provides an in-depth analysis of the advantages and potential pitfalls of the application of Granger Causality (GC) to calcium imaging data, especially regarding various types of pre-processing. The key strength of the manuscript is the rigor and thoroughness of the authors' approach, and it is very clear how one would go about replicating their work. On the other hand, it is not clear from the results how well one should trust GC for an unknown system, as many results rely on having some specialized knowledge about the measurements beforehand.

    Strengths:

    - Understanding how to measure causality is a key problem in modern science, and with the increasing abundance of wide-field calcium imaging, understanding how to assess information flow between neurons from these data is of wide interest and importance.

    - I was impressed by the rigor and explicitness of the authors' approach. In papers like this, there is the temptation to sweep problems under the rug and highlight the successes. Here, the authors present, in a clearly organized format, the effects of various methods and analysis decisions. Moreover, the methods are described in a manner such that they could be (relatively) easily implemented by the reader.

    - In general, the approach of using the F-statistic as the GC value and then normalizing by a null model is an appealing method that has a lot of intuitive and quantitative value.

    Weaknesses:

    - It's not clear to me what lessons are specific to the system they are studying and which ones are to be taken as more general lessons. Certainly, slow calcium dynamics, motion artifacts, and smoothing are general problems in calcium imaging, but I found myself puzzled a bit about how to decide which neurons are "strange" without a lot of system-specific knowledge. This seems to be a rather important effect, and having a bit more guidance in the discussion would be useful.

    - Somewhat related, I'm not entirely sure what results I should take home from the hindbrain analysis. It is clear that there is a more-or-less global signal modulating all neural activity, but this is a common occurrence in population recordings (often, one subtracts this off via PCA or another means before proceeding). Is the general lack of causal links (via the MVGC at least) a generic phenomenon in recurrent networks, or is there something more system-specific here? Accordingly, it might be interesting to run a recurrent neural network simulation with similar properties to the hindbrain (and perhaps with correlated driving) to see what GC/MVGC would predict. Is there any hope of these methods finding information flow in recurrent networks, or should we restrict the method to networks where we expect the primary mode of information transmission to be feedforward?
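
    For concreteness, below is a rough sketch of the kind of simulation suggested above, with all parameters chosen purely for illustration: a small recurrent rate network receiving a shared slow drive, whose traces could then be convolved with a calcium kernel and passed through the same GC/MVGC pipeline as the paper's synthetic networks.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, dt = 10, 5000, 0.01
W = 0.9 * rng.normal(scale=1 / np.sqrt(n), size=(n, n))  # random recurrent weights
np.fill_diagonal(W, 0.0)

drive = np.zeros(T)          # shared slow (Ornstein-Uhlenbeck-like) input
rates = np.zeros((T, n))
for t in range(1, T):
    drive[t] = drive[t - 1] - dt * drive[t - 1] + 0.3 * np.sqrt(dt) * rng.normal()
    recurrent = W @ np.tanh(rates[t - 1])
    noise = 0.1 * np.sqrt(dt) * rng.normal(size=n)
    rates[t] = rates[t - 1] + dt * (-rates[t - 1] + recurrent + 2.0 * drive[t]) + noise
# `rates` (T x n) would then be convolved with a calcium kernel and fed to GC/MVGC.
```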

  3. Reviewer #2 (Public Review):

    The authors consider the application of Granger causality (GC) analysis to calcium imaging data and identify several challenges therein and provide methodological approaches to address them. In particular, they consider case studies involving fluorescence recordings from the motoneurons in embryonic zebrafish and the brainstem and hindbrain of larval zebrafish to demonstrate the utility of the proposed solutions in removing the spurious links that the naive GC identifies.

    The paper is well-written and the results on the chosen case studies are compelling. However, the work would benefit from discussing its contributions in the context of the existing and relevant literature and from clarifying some of the methodological points that require more rigorous treatment. I have the following comments:

    Major comments:

    1. I would like to point out recent literature that adapts the classical GC for both electrophysiology data and calcium imaging data:

    [1] A. Sheikhattar et al., "Extracting Neuronal Functional Network Dynamics via Adaptive Granger Causality Analysis", PNAS, Vol. 115, No. 17, E3869-E3878, 2018.

    [2] N. A. Francis et al., "Small Networks Encode Decision-Making in Primary Auditory Cortex", Neuron, Vol. 97, No. 4, 2018.

    [3] N. A. Francis et al., "Sequential Transmission of Task-Relevant Information in Cortical Neuronal Networks", Cell Reports, Vol. 39, No. 9, 110878, 2022.

    In reference [1], a variation of GC based on GLM log-likelihoods is proposed that addresses the issues of non-linearity, non-stationarity, and non-Gaussianity of electrophysiology data. In [2] and [3], a variation of GC using sparse multi-variate models is introduced with application to calcium imaging data. In particular, all three references use sparse estimation of the MVAR parameters in order to mitigate overfitting, and also use corrections for multiple comparisons that reduce the number of spurious links (see my related comments below). I suggest discussing these relevant references in the introduction (paragraphs 2 and 3) and discussion.
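
    For illustration, a minimal sketch of the sparse MVAR estimation used in references [1]-[3] follows (this is not the authors' least-squares pipeline; the lag and penalty values are placeholders).

```python
import numpy as np
from sklearn.linear_model import Lasso

def fit_sparse_mvar(data, lag=3, alpha=0.05):
    """data: (T, n) array of fluorescence traces; returns (lag, n, n) coefficients."""
    data = np.asarray(data, float)
    T, n = data.shape
    X = np.hstack([data[lag - k: T - k] for k in range(1, lag + 1)])  # (T-lag, lag*n)
    Y = data[lag:]                                                    # (T-lag, n)
    coefs = np.zeros((lag * n, n))
    for j in range(n):  # one L1-penalized regression per target neuron
        coefs[:, j] = Lasso(alpha=alpha, fit_intercept=True).fit(X, Y[:, j]).coef_
    # coefs[k, i, j]: influence of neuron i at lag k+1 on neuron j
    return coefs.reshape(lag, n, n)
```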

    2. A major issue of GC applied to calcium imaging data is that the trials are typically limited in duration, which results in overfitting of the MVAR parameters when using least squares (see references [2] and [3] above, for example). The authors mention on page 4 that they use least squares to estimate the parameters. However, for the networks of ~10 neurons considered in this work, stationary trials of a long enough duration are required to estimate the parameters correctly. I suggest that the authors discuss this point and explicitly mention the trial durations and test whether the trial durations suffice for stable estimation of the MVAR parameters (this can be done by repeating some of the results on the synthetic data and using different trial lengths and then assessing the consistency of the detected GC links).
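
    A minimal sketch of such a consistency check is given below, where `fit_mvgc` and `significant_links` are hypothetical stand-ins for the pipeline's own estimation and thresholding steps.

```python
import numpy as np

def link_consistency(data, lengths, fit_mvgc, significant_links):
    """data: (T, n) synthetic traces; lengths: trial durations (in samples) to test."""
    # `significant_links` is assumed to return a boolean (n, n) adjacency matrix.
    reference = significant_links(fit_mvgc(data))            # links from the full trial
    overlaps = {}
    for T_sub in lengths:
        links = significant_links(fit_mvgc(data[:T_sub]))
        union = np.sum(reference | links)
        overlaps[T_sub] = np.sum(reference & links) / union if union else 1.0
    return overlaps                                           # Jaccard overlap per length
```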

    3. The definition of the "knee" of the average GC values as a function of the lag L needs to be a bit more formalized. In Fig. 2H using the synthetic data, the "knee" effect is more clear, but in the real data shown in Fig. 2I, the knee is not obvious, given that the confidence intervals are quite wide. Is there a way to quantify the "knee" by comparing the average GC values as well as their confidence bounds along the lag axis?
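
    One possible formalization (an illustrative suggestion, not the authors' definition) is a chord-distance criterion: pick the lag at maximum perpendicular distance from the straight line joining the first and last points of the mean-GC-versus-lag curve, and apply the same rule to the lower and upper confidence envelopes to check whether they agree.

```python
import numpy as np

def knee_lag(lags, mean_gc):
    """Lag at maximum distance from the chord joining the curve's endpoints."""
    lags, g = np.asarray(lags, float), np.asarray(mean_gc, float)
    # Normalize both axes so the distance criterion is scale-free;
    # assumes the mean GC value increases with the lag overall.
    x = (lags - lags[0]) / (lags[-1] - lags[0])
    y = (g - g[0]) / (g[-1] - g[0] + 1e-12)
    dist = np.abs(y - x) / np.sqrt(2)   # perpendicular distance to the chord y = x
    return lags[np.argmax(dist)]
```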

    4. While the measures of W_{IC} and W_{RC} form suitable guiding principles for the pipeline presented in this work, it would be helpful if the authors discuss how such measures can be used for other applications of GC to calcium imaging data in which a priori information regarding the left/right symmetry or the rostrocaudal flow of information is missing.

    5. Removing the "strange" neurons discussed in Section C5 is definitely an important pre-processing step in applying GC. However, the criterion for identifying the strange neurons seems a bit ad hoc and unclear. Could this be done by clustering the neurons into several categories (based on their time courses) and then removing a "strange" cluster? Please clarify.
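
    As a sketch of what such a clustering step could look like (the cluster count and the correlation-based distance are illustrative choices, not the authors' criterion):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_time_courses(data, n_clusters=3):
    """data: (T, n) fluorescence traces; returns one cluster label per neuron."""
    data = np.asarray(data, float)
    corr = np.corrcoef(data.T)                # (n, n) pairwise correlations
    dist = 1.0 - corr                         # dissimilarity between time courses
    iu = np.triu_indices_from(dist, k=1)      # condensed form for hierarchical clustering
    Z = linkage(dist[iu], method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```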

    6. Another key element of existing GC methods applied to large-scale networks is dealing with the issue of multiple comparisons: for instance, in Figures 2, 3, 4, 6, 7, and 8, it seems like all arrows corresponding to all possible links are shown, where the colormap indicates the GC value. However, when performing multiple statistical tests, many of these links can be removed by a correction such as the Benjamini-Hochberg procedure. It seems that the authors did not consider any correction of multiple comparisons; I suggest doing so and adding this to your pipeline.
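
    A minimal sketch of this correction on a matrix of per-link p-values, using the Benjamini-Hochberg procedure as implemented in statsmodels (an illustrative implementation choice):

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

def fdr_filter_links(p_matrix, q=0.05):
    """p_matrix: (n, n) p-values for all directed links (diagonal ignored)."""
    off_diag = ~np.eye(p_matrix.shape[0], dtype=bool)
    reject, _, _, _ = multipletests(p_matrix[off_diag], alpha=q, method="fdr_bh")
    keep = np.zeros_like(off_diag)
    keep[off_diag] = reject
    return keep   # boolean mask of links surviving FDR correction
```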

    7. The authors use TV denoising and also mention that it is a global operator, and changes the values of a time series at time t based on both the past and future values of the process. As such, it is not clear how TV denoising could affect the "causal" relations of the time series. In particular, TV denoising would significantly change the \Gamma_{ii} coefficients in Eq. (8). Is it possible to apply a version of TV denoising that only uses the information from the past to denoise the process at time t? In other words, using a "filter" as opposed to a "smoother". Please clarify.
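
    One way to realize such a "filter" variant is sketched below (an assumption about how it could be done, not something the paper implements): denoise only the samples up to time t and keep the last output, so no future information leaks backwards; `tv_denoise_1d` is a placeholder for any one-dimensional total-variation solver.

```python
import numpy as np

def causal_tv_filter(trace, tv_denoise_1d, weight=0.1, window=200):
    """Causal (one-sided) denoising: output at time t depends only on trace[:t+1]."""
    trace = np.asarray(trace, float)
    out = np.empty_like(trace)
    for t in range(len(trace)):
        start = max(0, t - window + 1)   # bounded history, for tractability only
        out[t] = tv_denoise_1d(trace[start:t + 1], weight)[-1]
    return out
```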

    8. The idea of using an adaptive threshold as in Section C8 is interesting; but this problem was previously considered in [30] (in the manuscript) and reference [1] above, in which new test statistics based on log-likelihoods are used that have well-known asymptotic null distributions (i.e., chi-square distributions). In particular, reference [1] above identifies and applies the required rescaling for the asymptotic null distributional assumptions to hold. I suggest discussing your work regarding the adaptive thresholds in the context of these existing results.

    9. Related to the previous comment, given that the authors use a shuffling procedure to obtain the null, it is not clear why fitting the F-distribution parametrically and using its quantiles for testing would provide further benefits. In fact, as shown in Figure S9B, the rescaled F-distribution does not fully match the empirical null distribution, so it may be worth using the empirical null to obtain the non-parametric quantiles for testing. Please clarify.
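
    A minimal sketch of the non-parametric alternative raised here, thresholding each link at an empirical quantile of its own shuffle null (array shapes and names are illustrative):

```python
import numpy as np

def empirical_threshold(gc_observed, gc_shuffled, alpha=0.01):
    """
    gc_observed: (n, n) GC values on the real data.
    gc_shuffled: (n_shuffles, n, n) GC values recomputed on shuffled surrogates.
    """
    thresh = np.quantile(gc_shuffled, 1.0 - alpha, axis=0)   # per-link null quantile
    return gc_observed > thresh                               # boolean significance mask
```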

    10. In Figure 5C, the values of W_{IC} for the MV cases seem to be more than 1, whereas by definition they should be less than or equal to 1. Please clarify.

    11. Is there evidence that the lateralized and rostrocaudal connectivity of the motoneurons occurs at the time-scale of ~750 ms? Given that this time scale is long enough for multiple synapses, it could be the case that some contralateral and non-rostrocaudal connections could be "real", as they reflect multi-hop synaptic connections. Please clarify.

    12. While it is useful to see the comparison of the BV and MV cases shown in Figs. 1 and 2, given extensive evidence in the GC literature on the shortcomings of the BV version of GC, it seems unnecessary to report the BV results in Figs. 3 onward. I suggest discussing the shortcoming of the BV case when presenting figures 1 and 2 and removing the BV results from the subsequent results.

  4. Reviewer #3 (Public Review):

    This manuscript provides a helpful and transparent guide on the application of Granger causality (GC) to calcium datasets. This is a useful entry point toward understanding the suitability and limitations of GC for neural data. However, it is not entirely convincing that the variations of GC analysis provided in this manuscript can be effectively applied to large-scale calcium datasets without prior knowledge of the underlying circuit, especially when such networks are likely to contain redundancy and recurrent links.

    I would like to acknowledge that, at the outset, I held an unfavorable prior belief toward GC, for reasons that are well addressed in this manuscript, including the dangers of applying spectral GC to nonlinear networks, as well as a variety of pathologies that can undermine naive GC.

    The manuscript has been helpful, both for its effective presentation of bivariate GC and its multivariate extension, and for the practical considerations that are essential to applying it to real-life data. It was particularly helpful to see a treatment of the challenges and their possible resolutions. I commend the authors for their transparency - they should certainly be rewarded rather than punished for it.

    Major
    1. Redundant signals: throughout the brain, it's expected that a population of neurons can encode the same information. It's unclear how GC (both the original and the modified versions) can handle this redundancy. Given how pervasive redundant signals are in the brain, this should be addressed in both simulation and experimental data. For example, in one of the manuscript's simulated networks, replace one neuron with 10 copies of it, each with identical inputs and outputs but with the weights scaled by 1/10. Such a network is functionally equivalent to the original but may pose some challenges for the various versions of GC. I believe this issue also accounts for the MVGC results in the hindbrain dataset. It might be more appropriate to apply GC to groups of neurons (as indeed the authors cited), instead of applying it at the single-cell level with redundant signals.
    2. Similarly, there is recurrent connectivity throughout the brain. The current manuscript appears to assume feedforward networks. Is the idea that GC cannot be applied to recurrent networks? If so, this needs to be clearly stated. If the authors believe that GC can recover causal links even in the presence of recurrent connectivity, this needs to be demonstrated.
    3. Both BVGC and MVGC appear to be extremely sensitive to any outlier signals. The most worrying aspect is that the authors developed their corrections and pipelines with the benefit of knowing the structure of the underlying system, whereas in the case where GC would be most useful, the user would be unable to rely on prior knowledge of the underlying structure. For instance, the motion artifact in Fig 3a-c was a helpful example of a vulnerability of naive GC, but one could easily imagine scenarios involving an unmeasured disturbance (e.g. the table is bumped) causing a similar artifact, but if the experimenter is unaware of such unmeasured disturbances then they will not be included in Z, and hence can result in the detection of widespread spurious links.
    There is a circularity here that's concerning. It seems that one already needs to have the answer (e.g. circuit connectivity) in order to clean up the data sufficiently for BVGC or MVGC to work effectively. Perhaps the authors would be interested in incorporating ideas from the systems identification literature, which can include the estimation of unmeasured disturbances, perhaps in conjunction with L1 regularization on the GC links. This is certainly out of scope for the present work, but it would be worth acknowledging the difficulties of unmeasured disturbances and deferring a general solution to future work. Similar considerations apply to a common unmeasured neuronal input (e.g. from a brain region not included in the field of view of the imaging).
    4. Interpretation - would it be correct to state that BVGC identifies plausible causal links, while MVGC identifies a plausible system-level model? I think these interpretations, carefully stated, might provide a helpful way of thinking about the two GC approaches. Taking the results of the paper together, neither BVGC nor MVGC is definitive - BVGC may overestimate the true number of causal links but MVGC is prone to a winner-take-all phenomenon that may represent just one of many plausible system-level models that can account for the observed data. This should be more clearly stated in the manuscript.
    5. "correlation completely misses the structure" - links are signed, so they should be shown with "bwr" colormap, with zero mapped to white (i.e. v_min is blue, 0 is white, v_max is red, |v_min| = |v_max|, this is natively supported in PyPlot and can be trivially implemented or downloaded in MATLAB). It is misleading that correlation appears to miss certain links marked in black, until one realizes that these links are inhibitory. It would substantially aid clarity and consistency if all panels followed this signed "bwr" convention. I think the emphasis for the GC panels is on whether links are detected, rather than the weight of the link, so I would suggest indicating detected inhibitory links as -1 (blue) and detected excitatory links as +1 (red), and link not detected as 0 (white).