STRIDE: A Sequencing Depth-Insensitive Metric for Robust Comparison between Sparse Chromosome Conformation Capture Data
Abstract
The study of three-dimensional (3D) genome organization has been revolutionized by high-throughput chromatin conformation capture (Hi-C) and its derivatives. However, dependence on sequencing depth severely restricts the reliability and accuracy of existing Hi-C tools, especially in single cells. To address this issue, we introduce a novel computational framework based on the mean first passage time (MFPT) from Markov chain theory, which transforms chromatin contact matrices into a robust, distance-based representation. We demonstrate that the MFPT representation is inherently insensitive to sequencing depth. Leveraging this transformation, we develop STRIDE (Spatial Topological Representation of Interaction Distance Evaluation), a parameter-free metric for library similarity. STRIDE is resilient to sparse and noisy data, insensitive to technical variability, and facilitates unsupervised embedding of single-cell Hi-C data, enabling accurate delineation of cellular states and developmental trajectories. In conclusion, as a reliable computational framework for sparse and noisy data, STRIDE may serve as a basis for a wide range of single-cell 3D genome analyses that were previously infeasible.
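To make the MFPT transformation concrete, the following is a minimal sketch of one standard way to compute pairwise mean first passage times from a contact matrix, using the textbook fundamental-matrix formula for an irreducible Markov chain. It is not the authors' implementation, and it does not compute the STRIDE metric itself, which the abstract does not specify. The function name `mfpt_matrix` is hypothetical, and the sketch assumes a symmetric, non-negative contact matrix in which every bin has at least one contact and all bins are connected.

```python
import numpy as np


def mfpt_matrix(contacts):
    """Pairwise mean first passage times for a random walk on a contact matrix.

    Assumptions (not taken from the source): `contacts` is symmetric,
    non-negative, has no empty rows, and describes a connected graph.
    """
    A = np.asarray(contacts, dtype=float)
    deg = A.sum(axis=1)
    if np.any(deg == 0):
        raise ValueError("every bin needs at least one contact")

    # Row-normalize contacts into transition probabilities.
    P = A / deg[:, None]

    # For a symmetric contact matrix the walk is reversible, so the
    # stationary distribution is proportional to the row sums.
    pi = deg / deg.sum()

    n = A.shape[0]
    W = np.tile(pi, (n, 1))  # every row equals pi

    # Fundamental matrix Z = (I - P + W)^-1.
    Z = np.linalg.inv(np.eye(n) - P + W)

    # M[i, j] = (Z[j, j] - Z[i, j]) / pi[j]; the diagonal is zero.
    return (np.diag(Z)[None, :] - Z) / pi[None, :]


# Toy example: three bins, every pair in contact once.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])
M = mfpt_matrix(A)  # every off-diagonal entry is 2 steps

# Multiplying all counts by a constant leaves the transition matrix,
# and therefore the MFPT matrix, unchanged.
M_scaled = mfpt_matrix(7 * A)
```

Because the contact counts enter only through the row-normalized transition matrix, a uniform rescaling of the counts cancels exactly. Random downsampling of reads is a different perturbation and would only be expected to preserve the MFPT values approximately; the depth insensitivity reported in the abstract is the authors' empirical claim and is not established by this toy example.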