Analyzing Co-expression Networks with Network Skeleton Extraction
Abstract
Background: A central question in systems biology is how genes work together to perform a specific function. While publicly available gene regulatory networks are commonly used to analyze experimental data, these networks may contain inaccuracies or inconsistencies across databases, especially for less well-studied organisms or systems where regulatory networks change over time. Data-driven network reconstruction from co-expression data is an appealing alternative, but requires sparsification to yield biologically representative networks. Threshold-based sparsification can lead to an excessively fragmented network, masking relationships between genes and neglecting the "strength of weak ties", wherein a critical but weak relationship may serve as a vital link between two subnetworks.

Results: Here we present Network Skeleton Extraction (NSE), a co-expression network generation method that uses spectral sparsification to reduce co-expression statistics to minimal co-expression graphs. Spectral sparsification has the advantage of maintaining connections among genes in a manner that preserves the coarse-grained structure of the input graph. In our method, the degree of sparsification is optimized by predicting each gene's expression as a function of its connected genes at each sparsification level. This yields networks that are maximally sparse while still being predictive of gene expression. A probabilistic model also provides a null distribution of networks with similar spectral properties against which inferred networks can be compared. We illustrate the method by applying it to Xenopus transcriptome data from four cell types and six developmental stages to obtain networks specific to the organism, cell type, and developmental stage.

Conclusions: By applying NSE to pre-defined gene sets in a phenotype-conditional manner, we can identify pathways whose coordination differs significantly across cell types and developmental stages.
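The two ideas in the Results (spectral sparsification of a co-expression graph, and choosing the sparsification level by how well each gene is predicted from its neighbours) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes absolute Pearson correlation as the co-expression statistic, effective-resistance edge sampling (in the style of Spielman and Srivastava) as the spectral sparsifier, ordinary least squares as the per-gene predictor, and a hypothetical tolerance rule (`tol`) for picking the sparsest acceptable graph. All function names are illustrative.

```python
import numpy as np

def coexpression_weights(X):
    """X is samples x genes. Assumption: edge weight = |Pearson correlation|."""
    W = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(W, 0.0)
    return W

def effective_resistance_sparsify(W, n_draws, rng):
    """Sample n_draws edges with probability proportional to
    weight * effective resistance, reweighting each sampled edge so the
    graph Laplacian is preserved in expectation."""
    n = W.shape[0]
    iu, ju = np.triu_indices(n, k=1)
    w = W[iu, ju]
    keep = w > 0
    iu, ju, w = iu[keep], ju[keep], w[keep]
    L = np.diag(W.sum(axis=1)) - W           # graph Laplacian
    Lp = np.linalg.pinv(L)                   # Moore-Penrose pseudoinverse
    R = Lp[iu, iu] + Lp[ju, ju] - 2.0 * Lp[iu, ju]  # effective resistances
    p = np.clip(w * R, 1e-12, None)
    p = p / p.sum()
    counts = rng.multinomial(n_draws, p)
    new_w = counts * w / (n_draws * p)
    S = np.zeros_like(W)
    S[iu, ju] = new_w
    return S + S.T

def heldout_error(X_train, X_test, S):
    """Mean squared held-out error when each gene is predicted from its
    neighbours in S by least squares (isolated genes use the training mean)."""
    errs = []
    for g in range(S.shape[0]):
        nb = np.flatnonzero(S[g])
        if nb.size == 0:
            pred = np.full(X_test.shape[0], X_train[:, g].mean())
        else:
            A = np.column_stack([X_train[:, nb], np.ones(X_train.shape[0])])
            coef, *_ = np.linalg.lstsq(A, X_train[:, g], rcond=None)
            B = np.column_stack([X_test[:, nb], np.ones(X_test.shape[0])])
            pred = B @ coef
        errs.append(np.mean((X_test[:, g] - pred) ** 2))
    return float(np.mean(errs))

def select_level(X_train, X_test, W, draw_grid, tol=0.05, seed=0):
    """Hypothetical selection rule: among levels whose error is within
    (1 + tol) of the best error, return the one with the fewest edges."""
    rng = np.random.default_rng(seed)
    results = []
    for q in draw_grid:
        S = effective_resistance_sparsify(W, q, rng)
        n_edges = int(np.count_nonzero(np.triu(S)))
        results.append((q, n_edges, heldout_error(X_train, X_test, S)))
    best = min(r[2] for r in results)
    acceptable = [r for r in results if r[2] <= (1.0 + tol) * best]
    return min(acceptable, key=lambda r: r[1]), results
```

Given an expression matrix split into training and held-out samples, `select_level(X_train, X_test, coexpression_weights(X_train), draw_grid)` returns a `(draws, edges, error)` tuple for the chosen level together with the full list of candidates. Unlike a hard correlation threshold, the sampling step can retain a weak edge when it is the main connection between two otherwise separate groups of genes, because such an edge has high effective resistance.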