Inferring and Evaluating Network Medicine-Based Disease Modules with Nextflow
Abstract
Most human diseases result from complex molecular interactions of genes and proteins. Various network-based computational methods characterize these mechanisms by expanding seed genes into disease-associated subnetworks, or disease modules.
Evaluating these diverse methods is tedious because each has its own installation and data preparation requirements. Moreover, their underlying algorithmic strategies differ, making it difficult to determine which of the resulting modules are most useful or biologically plausible.
To address this challenge, we developed an all-in-one Nextflow pipeline that enables automated and reproducible analyses. It handles installation, input preparation, execution, and systematic evaluation of six widely used module detection tools, considering module topology, functional coherence, robustness, and the capacity to recover seeds. In addition, it annotates the resulting disease modules with biological context information, prioritizes potential drug candidates, and generates visualizations and a comprehensive summary report.
To showcase the value of our pipeline and offer guidance to potential users, we performed a comprehensive evaluation across 50 different disease-network combinations, revealing substantial variability among the derived disease modules. We show that this variability is driven by differences in modeling approach, input network, and seed composition. While most methods are robust to minor perturbations, they struggle to recover omitted seeds, and none consistently outperforms the others, underscoring the need for careful method selection.
Our work enables the research community to systematically compare approaches for disease module discovery, promoting reproducible network medicine research. Integrated into the nf-core project (https://nf-co.re/diseasemodulediscovery), it is intended as an extendable, long-term resource for tracking progress in the field.