Network Accuracy Across Local, Mesoscale, and Global Structures using Stochastic Block Models

Abstract

Researchers apply network psychometric methods across many levels of analysis, from understanding causal pathways (edges) and symptoms that drive comorbidity (local) to uncovering community structures (mesoscale) and group differences (global). The validity of inferences at each level depends on the accuracy of network estimation methods. To date, simulation studies have focused almost exclusively on edge recovery using data-generating mechanisms that lack known community structures. In this large-scale simulation study, we evaluated four network estimation methods (EBICglasso variants, GGMmod, GGMncv, GGMnonreg) across four levels of analysis (edge, local, mesoscale, global) using Stochastic Block Models with empirically informed edge weight distributions derived from 293 psychological networks. We generated continuous, polytomous, and dichotomous data across various conditions to provide concrete guidelines for applied researchers. Overall, regularization methods (EBICglasso) excelled in smaller samples (≤ 1,000) and dichotomous data, whereas non-convex (GGMncv) and non-regularized (GGMnonreg) methods excelled in larger samples (≥ 2,500) and continuous data, reflecting a trade-off between stability and desirable asymptotic properties. Across conditions, GGMmod had the most balanced performance, making it the recommended general-purpose method. Critically, local measure accuracy varied substantially, with network loadings being the most robust and bridge strength the least reliable (even in ideal conditions). We established interpretation benchmarks for global metrics (Frobenius norm and Jensen-Shannon Similarity) and identified community detection as robust except under challenging conditions (small samples and dichotomous data). Our results demonstrate that estimation methods should be selected based on sample characteristics, data type, and the level of analysis most relevant to the researcher’s aims.
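To make two of the ingredients named in the abstract concrete, the following is a minimal sketch of a stochastic block model with a known community structure, plus a Frobenius-norm comparison between a true and an estimated network. It is not the study's simulation code: the function names, the unweighted 0/1 edges, and the parameters `p_within` and `p_between` are assumptions made for illustration, whereas the study draws edge weights from empirically informed distributions.

```python
import math
import random


def sample_sbm(block_sizes, p_within, p_between, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a simple stochastic block model.

    Hypothetical helper: edges are unweighted here, unlike the weighted
    networks described in the abstract.
    """
    rng = random.Random(seed)
    # Community label for every node, e.g. [0, 0, 0, 1, 1, 1].
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Edge probability depends only on whether the nodes share a block.
            p = p_within if labels[i] == labels[j] else p_between
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj, labels


def frobenius_distance(a, b):
    """Frobenius norm of the elementwise difference between two matrices."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


# Two communities of three nodes each; within-block edges are far more likely.
true_net, communities = sample_sbm([3, 3], p_within=0.8, p_between=0.1, seed=1)
```

Because the community labels are known by construction, an estimated network can be scored against `true_net` both at the global level (a smaller `frobenius_distance` means a closer match) and at the mesoscale level (by comparing detected communities with `communities`).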
