Network Accuracy Across Levels of Analysis Using Stochastic Block Models
Abstract
Researchers apply network psychometric methods at many levels of analysis, from understanding causal pathways (edges) and variable importance (centrality) to uncovering dimensionality (community) and group differences (network similarity). The validity of inferences at each level depends on the accuracy of network estimation methods, yet simulation studies have focused almost exclusively on edge recovery, using data-generating mechanisms that lack known community structures. In this large-scale simulation study, we evaluated four network estimation methods (EBICglasso variants, ggmModSelect, GGMncv, GGMnonreg) across four levels of analysis using Stochastic Block Models with empirically informed edge weight distributions derived from 293 empirical networks in psychology. For most levels, large samples (≥ 1,000) were necessary for adequate recovery, echoing concerns about whether many published studies have sufficient evidence for the inferences they draw. The EBICglasso variants outperformed the other methods in smaller samples (≤ 1,000) and with dichotomous data, although recovery under these conditions was inadequate for all methods (including EBICglasso). Most methods achieved adequate accuracy in larger samples (≥ 2,500) with continuous data. Centrality accuracy varied substantially: network loadings were reliably recovered across all conditions, whereas bridge strength was unreliable even under ideal conditions. For node and bridge strength, we recommend reporting raw values rather than rankings, because rank-order recovery required substantially larger samples. We also established interpretation benchmarks for network similarity metrics (sF and Jensen-Shannon Similarity) and found community detection to be robust except under challenging conditions (small samples and dichotomous data). Collectively, these findings provide concrete, level-specific guidance for method and measure selection based on sample size and data categories.
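To make the centrality measures named in the abstract concrete, the following is a minimal sketch, not the study's simulation code. It builds a toy weighted network with a known community structure in the spirit of a stochastic block model, then computes node strength (sum of absolute edge weights) and bridge strength (sum of absolute edge weights to nodes in other communities). The function names, edge probabilities, and weight ranges are all hypothetical choices for illustration; the study itself draws edge weights from empirically informed distributions.

```python
import random


def simulate_sbm(communities, p_within, p_between, rng):
    """Return a symmetric weighted adjacency matrix for a toy block model.

    communities: list of community labels, one per node.
    p_within / p_between: hypothetical edge probabilities.
    """
    n = len(communities)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            same = communities[i] == communities[j]
            if rng.random() < (p_within if same else p_between):
                # Hypothetical weight ranges, not the empirical distributions
                # used in the study.
                w = rng.uniform(0.15, 0.40) if same else rng.uniform(0.05, 0.15)
                weights[i][j] = weights[j][i] = w
    return weights


def node_strength(weights):
    """Sum of absolute edge weights attached to each node."""
    return [sum(abs(w) for w in row) for row in weights]


def bridge_strength(weights, communities):
    """Sum of absolute edge weights from each node to other communities."""
    return [
        sum(abs(w) for j, w in enumerate(row) if communities[j] != communities[i])
        for i, row in enumerate(weights)
    ]


def ranks(values):
    """Rank nodes from 1 (largest value) to n (smallest value)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


rng = random.Random(2024)  # fixed seed so the toy network is reproducible
communities = [0] * 4 + [1] * 4 + [2] * 4  # three communities of four nodes
weights = simulate_sbm(communities, p_within=0.8, p_between=0.2, rng=rng)

strength = node_strength(weights)
bridge = bridge_strength(weights, communities)
strength_ranks = ranks(strength)

for node, (s, b, r) in enumerate(zip(strength, bridge, strength_ranks)):
    print(f"node {node}: strength={s:.3f} bridge={b:.3f} rank={r}")
```

Printing raw values next to ranks shows why the two can behave differently: nodes whose strengths differ only slightly still receive distinct ranks, so a small estimation error can reorder them while leaving the raw values nearly unchanged.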