Tractometry reproducibility and generalizability across scanners, scanner models, and acquisition protocols
Abstract
Diffusion-weighted magnetic resonance imaging (dMRI)-based tractometry enables the quantification of white matter tissue properties in living humans while preserving anatomical specificity. Although tractometry is highly reproducible when the same scanner and acquisition protocol are used, its generalizability across scanners and protocols remains unclear. To address this gap, we performed a traveling-head experiment involving five subjects to evaluate tractometry across progressively different acquisition conditions, including multiple scanners, different scanner models, and two distinct protocols. Tractometry was performed for 20 major white matter tracts using diffusion tensor imaging metrics, neurite orientation dispersion and density imaging (NODDI) metrics, and a semi-quantitative ratio metric (T1w/b0). Generalizability across dataset pairs was quantified using the intraclass correlation coefficient (ICC). Tractometry showed consistently high ICCs when the scanner and protocol were identical; however, ICCs declined as differences in scanner model and acquisition protocol increased. Fractional anisotropy and orientation dispersion index retained relatively high ICCs across these comparisons, whereas other metrics showed marked declines when scanners or protocols differed. ComBat harmonization partially mitigated these declines, but ICCs did not reach the levels observed for datasets acquired using identical scanners and protocols. Finally, the minimum detectable change (MDC) for tractometry in datasets pooled across scanners and protocols varied by tract; for example, the optic radiation showed a lower MDC than the cingulum hippocampus. These findings highlight both the strengths and limitations of tractometry in multisite studies and underscore the importance of quantifying scanner- and protocol-dependent effects for specific metrics and tracts when interpreting measurements from heterogeneous datasets.
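To make the two summary statistics named above concrete, the following is a minimal sketch of how an ICC and an MDC could be computed for one metric in one tract. The abstract does not state which ICC form or MDC formula was used, so the choices here are assumptions: ICC(2,1) (two-way random effects, absolute agreement, single measurement) and the common convention MDC95 = 1.96 × √2 × SEM with SEM = SD × √(1 − ICC). The function names `icc_2_1` and `mdc95` and the example values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: list of rows, one row per subject and one column per dataset
    (for example, one scanner/protocol condition per column).
    Assumed ICC form; the source does not specify the variant.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of datasets being compared
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Mean squares from a two-way ANOVA without replication
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def mdc95(sd, icc):
    """Minimum detectable change at the 95% level (assumed convention).

    SEM = SD * sqrt(1 - ICC); MDC95 = 1.96 * sqrt(2) * SEM.
    """
    sem = sd * math.sqrt(max(0.0, 1.0 - icc))
    return 1.96 * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Hypothetical tract-averaged values for five subjects in two datasets
    # (illustrative numbers only, not data from the study).
    fa = [
        [0.48, 0.47],
        [0.52, 0.50],
        [0.45, 0.46],
        [0.55, 0.53],
        [0.50, 0.49],
    ]
    icc = icc_2_1(fa)
    pooled_sd = stdev(v for row in fa for v in row)
    print(f"ICC(2,1) = {icc:.3f}, MDC95 = {mdc95(pooled_sd, icc):.4f}")
```

Because ICC(2,1) measures absolute agreement, a constant offset between two datasets lowers the ICC even when subjects keep the same rank order. This is the kind of scanner- or protocol-dependent shift that harmonization methods such as ComBat aim to reduce.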