NetFlow: A Framework to Explore Topological Representations of High-Dimensional Biomedical Data

Abstract

Biomedical datasets are increasingly high-dimensional and multi-modal, theoretically enabling a more comprehensive understanding of biological systems. However, common clustering and dimensionality reduction methods often oversimplify continua and obscure complex, biologically relevant patterns, hindering our ability to meaningfully probe sample relationships. To address this challenge, we developed NetFlow, a computational framework that constructs graph representations of structural relationships between samples by integrating a data-driven, lineage-tracing-inspired pseudo-ordering with sparse similarity networks. NetFlow is suitable for small cohorts, features a flexible, modular design, supports multi-modal data fusion, and provides interactive visualization, enabling nuanced exploration of heterogeneity across biological samples. Applied across uni- and multi-modal cancer datasets, NetFlow refined clinically relevant subtypes, identified biomarker associations, and revealed more informative and structured relational patterns when benchmarked against current approaches. NetFlow thus delivers an interpretable graph representation of sample relatedness that supports precision oncology, hypothesis generation, and general high-dimensional data analysis.
