On the variability of dynamic functional connectivity assessment methods


Abstract

Background

Dynamic functional connectivity (dFC) has become an important measure for understanding brain function and a potential biomarker. However, various methodologies have been developed for assessing dFC, and it is unclear how the choice of method affects the results. In this work, we aimed to study the variability in the results of commonly used dFC methods.

Methods

We implemented 7 dFC assessment methods in Python and used them to analyze the functional magnetic resonance imaging data of 395 subjects from the Human Connectome Project. We measured the pairwise similarity of the dFC results yielded by the different methods using several metrics that quantify overall, temporal, spatial, and intersubject similarity.
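The specific similarity metrics are described in the paper and implemented in the accompanying toolbox; purely as an illustrative sketch (not the authors' code), the overall similarity between two methods' dFC estimates can be computed as a rank correlation over all region-pair values across time, assuming both estimates are expressed over the same regions and time points:

```python
import numpy as np
from scipy.stats import spearmanr

def overall_similarity(dfc_a, dfc_b):
    """Rank correlation between two dFC estimates.

    dfc_a, dfc_b : arrays of shape (n_timepoints, n_regions, n_regions),
                   one connectivity matrix per time point, aligned across
                   the two methods being compared.
    """
    assert dfc_a.shape == dfc_b.shape
    n_time, n_regions, _ = dfc_a.shape
    iu = np.triu_indices(n_regions, k=1)  # unique region pairs, no diagonal
    # Flatten each estimate into one vector of region-pair values over time
    vec_a = np.concatenate([dfc_a[t][iu] for t in range(n_time)])
    vec_b = np.concatenate([dfc_b[t][iu] for t in range(n_time)])
    rho, _ = spearmanr(vec_a, vec_b)
    return rho
```

In a similar spirit, correlations computed per time point, per connection, or per subject would capture spatial, temporal, and intersubject similarity, respectively.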

Results

Our results showed a range of weak to strong similarity between the results of different methods, indicating considerable overall variability. Somewhat surprisingly, the observed variability in dFC estimates was found to be comparable to the expected functional connectivity variation over time, emphasizing the impact of methodological choices on the final results. Our findings revealed 3 distinct groups of methods with significant intergroup variability, each exhibiting distinct assumptions and advantages.

Conclusions

Overall, our findings shed light on the impact of analytical flexibility in dFC assessment and highlight the need for multianalysis approaches and careful method selection to capture the full range of dFC variation. They also emphasize the importance of distinguishing neural-driven dFC variations from physiological confounds and of developing validation frameworks under a known ground truth. To facilitate such investigations, we provide an open-source Python toolbox, PydFC, which enables multianalysis dFC assessment, with the goal of enhancing the reliability and interpretability of dFC studies.

Article activity feed

  1. Dynamic functional connectivity (dFC) has become an important measure for understanding brain function and as a potential biomarker. However, various methodologies have been developed for assessing dFC, and it is unclear how the choice of method affects the results. In this work, we aimed to study the results variability of commonly used dFC methods. We implemented seven dFC assessment methods in Python and used them to analyze fMRI data of 395 subjects from the Human Connectome Project. We measured the pairwise similarity of dFC results using several similarity metrics in terms of overall, temporal, spatial, and inter-subject similarity. Our results showed a range of weak to strong similarity between the results of different methods, indicating considerable overall variability. Surprisingly, the observed variability in dFC estimates was comparable to the expected natural variation over time, emphasizing the impact of methodological choices on the results. Our findings revealed three distinct groups of methods with significant inter-group variability, each exhibiting distinct assumptions and advantages. These findings highlight the need for multi-analysis approaches to capture the full range of dFC variation. They also emphasize the importance of distinguishing neural-driven dFC variations from physiological confounds, and developing validation frameworks under a known ground truth. To facilitate such investigations, we provide an open-source Python toolbox that enables multi-analysis dFC assessment. This study sheds light on the impact of dFC assessment analytical flexibility, emphasizing the need for careful method selection and validation, and promoting the use of multi-analysis approaches to enhance reliability and interpretability of dFC studies. Competing Interest Statement: The authors have declared no competing interest.

    Reviewer 2: Nicolas Farrugia

    Comments to Author:

    Summary of review: This paper fills a very important gap in the literature investigating time-varying functional connectivity (or dynamic functional connectivity, dFC) by measuring the analytical flexibility of seven different dFC methods. An impressive amount of work has been put in to generate a set of convincing results, which essentially show that the main object of interest of dFC, namely the temporal variability of connectivity, cannot be measured with high consistency, as this variability is of the same order of magnitude as, or even higher than, the changes observed across different methods on the same data. In this very controversial field, it is very remarkable that the authors have managed to put together a set of analyses to demonstrate this in a very clear and transparent way. The paper is very well written, the overall approach is based on a few assumptions that make it possible to compare methods (e.g., subsampling of the temporal aspects of some methods, spatial subsampling), and the provided analysis is very complete. The most important results are condensed in a few figures in the main manuscript, which is enough to convey the main messages. The supplementary materials provide an exhaustive set of additional results, which are briefly discussed one by one. Most importantly, the authors have provided an open-source implementation of the 7 main dFC methods. This is very welcome for the community and for reproducibility, and is of course particularly suited to this kind of contribution. A few suggestions follow.

    Clarification questions and suggestions:

    1 - How was the uniform downsampling of the 286 ROIs to 96 done? Uniform in which sense? According to the RSNs? Were ROIs regrouped with spatial contiguity? I understand this was done in order to reduce computational complexity and to harmonize across methods, but the manuscript would benefit from an added sentence explaining what was done (a purely illustrative grouping sketch follows these comments).

    2 - Table A in Figure 1 shows the important hyperparameters (HP) for each method, but the motivations behind the choice of HP for each method are only explained in the discussion (end of page 11, "we adopted the hyperparameter values recommended by the original paper or consensus among the community for each method"). It would be better to explain this in the methods, and then only discuss in the discussion why it can be a limitation.

    3 - The GitHub repository https://github.com/neurodatascience/dFC/tree/main does not reference the paper.

    4 - The GitHub repository https://github.com/neurodatascience/dFC/tree/main is not documented enough. There are two very large added values in this repo: the open implementation of the methods and the analytical flexibility tools. The demo notebook shows how to use the analytical flexibility tools, but the method implementations are not documented. I expect that many people will want to perform analyses using the methods as well as comparison analyses, so the documentation of the individual methods should not be minimized.

    5 - For the reader, it would be better to mention early in the manuscript (in the introduction) that the code is available for reproducibility. Currently, the toolbox is only introduced in the final paragraph of the discussion. It comes as a very nice surprise when reading the manuscript in full, but I think the manuscript would gain a lot of value if this paragraph were included earlier, and if the development of the toolbox were mentioned much earlier (i.e., in the abstract).

    6 - We have published two papers on dFC that the authors may want to include, although these papers investigated cerebello-cerebral dFC using whole-brain + cerebellum parcellations. The first paper used a continuous HMM on healthy subjects and found correlations with impulsivity scores, while the second paper used network measures on sliding-window dFC matrices in a clinical cohort (patients with alcohol use disorder). I am not sure why the authors have not found our papers in their literature search, but maybe it would be good to include them. The authors need to update the final table in the supplementary materials as well as the citations in the main paper. Abdallah, M., Farrugia, N., Chirokoff, V., & Chanraud, S. (2020). Static and dynamic aspects of cerebro-cerebellar functional connectivity are associated with self-reported measures of impulsivity: A resting-state fMRI study. Network Neuroscience, 4(3), 891-909. Abdallah, M., Zahr, N. M., Saranathan, M., Honnorat, N., Farrugia, N., Pfefferbaum, A., Sullivan, E., & Chanraud, S. (2021). Altered cerebro-cerebellar dynamic functional connectivity in alcohol use disorder: a resting-state fMRI study. The Cerebellum, 20, 823-835. Note that in Abdallah et al. (2020), while we did not compare HMM results with other dFC methods, we did investigate the influence of HMM hyperparameters, as well as perform internal cross-validation on our sample plus null models of dFC.
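    The sketch referenced in point 1 above is purely hypothetical: it assumes (without knowing the authors' actual procedure) that the 286 ROIs were reduced to 96 by averaging ROI time series within predefined groups, for instance formed within each resting-state network.

```python
import numpy as np

def downsample_rois(ts, groups):
    """Average ROI time series within predefined groups.

    ts     : array of shape (n_timepoints, n_rois)
    groups : list of index arrays, one per output region (e.g., ROIs of the
             same resting-state network split into roughly equal chunks)
    """
    return np.stack([ts[:, idx].mean(axis=1) for idx in groups], axis=1)

# Toy example: 286 ROIs reduced to 96 averaged regions
rng = np.random.default_rng(0)
ts_286 = rng.standard_normal((1200, 286))        # 1200 time points, 286 ROIs
groups = np.array_split(np.arange(286), 96)      # placeholder grouping
ts_96 = downsample_rois(ts_286, groups)          # shape (1200, 96)
```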

    Minor comment: "[..] what lies behind the of methods. Instead, they reveal three groups of methods, 720 variations in dynamic functional connectivity?." -> an extra "." was added (end of page 10).

  2. This work has been published in GigaScience under a CC-BY 4.0 license (https://doi.org/10.1093/gigascience/giae009), and the reviews have been published under the same license. They are as follows.

    Reviewer 1: Yara Jo Toenders

    Comments to Author: The authors performed an in-depth comparison of 7 dynamic functional connectivity methods. The paper includes many figures that are greatly appreciated, as they clearly demonstrate the findings. Moreover, the authors developed a Python toolbox that implements these 7 methods. The analyses showed that the results were highly variable across methods, although three clusters of similar methods could be detected. However, some questions remain after reading the manuscript.

    • The TR and number of time points of the fMRI images are given, but other acquisition parameters, such as the voxel size, are missing. Could all acquisition parameters please be provided?
    • Could more information be provided on the downsampling from 286 to 96 ROIs? How was this done, and what were the 96 ROIs that were created?
    • In the results it is explained that the definition of groups depended on the cutoff value of the clustering; however, it is unclear how the cutoff value was determined. Could the authors elucidate how this was done? (A generic sketch of how such a cutoff determines the groups follows this list.)
    • The difference between the subplots in Figure 3 is a bit difficult to understand because the labels of the different methods switch places. Perhaps the same colour could be used for the cluster of the Continuous HMM, Clustering, and Discrete HMM methods to increase readability?
    • Figure 4b shows that the default mode network is more variable over methods than over time, while the auditory and visual networks are not. Could the authors explain what may underlie this discrepancy?
    • From the introduction it became clear that many studies have used dFC to study clinical populations. While I understand that no single recommendation can be given, not every clinical study may have the capacity to use all 7 methods. What would the authors recommend for these clinical studies? Would there, for example, be a recommended method within each of the three clusters?
    • It could be helpful if the authors created DOIs for their toolbox code base that can be cited in a manuscript, rather than linking to bare GitHub URLs. One potentially useful guide is: https://guides.github.com/activities/citable-code/
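    On the clustering cutoff raised in the third bullet above, the following generic sketch (using an assumed linkage method and an arbitrary threshold, not the authors' actual settings) only illustrates how a distance cutoff on a hierarchical clustering of a between-method similarity matrix determines the number of method groups:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Stand-in 7x7 similarity matrix between dFC methods (random placeholder)
rng = np.random.default_rng(1)
similarity = np.corrcoef(rng.standard_normal((7, 50)))

distance = 1.0 - similarity                    # convert similarity to distance
np.fill_diagonal(distance, 0.0)
condensed = squareform(distance, checks=False) # condensed pairwise distances
Z = linkage(condensed, method="average")       # hierarchical clustering

# The cutoff (here 0.5, an arbitrary value) determines how many groups emerge
labels = fcluster(Z, t=0.5, criterion="distance")
print(labels)                                  # group label of each method
```

    Raising or lowering the cutoff t merges or splits groups, which is why the choice of this value directly affects how many method clusters are reported.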