A Fractal-Dimension Framework for Quantifying Self-Similarity in Chromatin Folding
Discuss this preprint
Start a discussion What are Sciety discussions?Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
The three-dimensional folding of DNA is essential for genome function, but its organization remains difficult to summarize quantitatively across genomic scales. Here, we study DNA folding from Hi-C contact data using a network-based notion of fractal dimension. In this representation, genomic loci are treated as nodes, and observed Hi-C contacts define weighted edges, so that frequently interacting loci are closer in the resulting network. We then estimate fractal dimension using two complementary graph-based methods: the correlation dimension and the sandbox dimension.
Validation on synthetic networks shows that the proposed estimators detect clear scaling behavior in hierarchical fractal-like networks, while distinguishing them from networks with local clustering but no stable multiscale self-similarity. Applied to intrachromosomal Hi-C data from the IMR90 human cell line, the method reveals approximate linear scaling regimes on log–log plots, suggesting fractal-like organization in chromatin contact networks. At the chromosome level, estimated fractal dimension tends to increase with chromosome size: larger chromosomes often have dimensions closer to 3, consistent with more compact and space-filling organization, whereas shorter chromosomes tend to have lower dimensions, closer to 1, consistent with simpler and more open folding patterns. A sliding-window analysis at 5 kb resolution further shows that fractal organization varies substantially along chromosomes rather than remaining uniform across genomic position.
These results suggest that graph-based fractal dimension provides an interpretable summary of DNA folding complexity at both global and local scales. More broadly, the proposed framework offers a quantitative way to study multiscale genome organization from Hi-C data using tools from network geometry.