Discovering conserved regulatory modules in predicted gene regulatory networks across species
Abstract
The discovery of conserved regulatory motifs across different species is a fundamental challenge in systems biology, especially considering the noisy and incomplete nature of predicted gene regulatory networks (GRNs) and the intractability of the underlying graph alignment problem. Traditional network alignment methods frequently enforce one-to-one node mappings or strict topological isomorphism, which fail to accommodate the many-to-many orthology mappings caused by evolutionary gene duplication. Consequently, strict constraints often yield highly fragmented topological islands rather than cohesive functional modules. In this work, we propose a relaxed topological alignment algorithm designed to extract conserved regulatory structures from cross-species GRNs. We formulate the discovery process as a multi-objective optimization problem that balances sequence homology, functional coherence, and a normalized topological consensus. To navigate the exponentially scaling search space, we introduce a greedy seed-and-extend heuristic bounded by a dynamic ϵ-stopping condition, which evaluates marginal objective gains to prevent functional dilution. We validate our algorithm using time-series transcriptomic data from Arabidopsis thaliana, Zea mays, and Sorghum bicolor focused on drought and developmental stress responses. While a strict topological baseline extracted only fragmented subgraphs limited to 51 homologous tuples, our relaxed heuristic successfully converged on a highly connected 444-tuple module. The resulting topology effectively links strictly conserved upstream transcription factors to their highly duplicated, species-specific downstream pathways. Our algorithm provides a robust, scalable computational methodology for identifying core regulatory logic across complex biological systems, facilitating the translation of conserved network architectures among multiple species.
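To make the seed-and-extend idea concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes each node is a homologous tuple (one gene per species) with a precomputed homology score and functional-coherence score, and that each edge carries a consensus weight in [0, 1] (for example, the fraction of species networks in which the corresponding regulatory edge is predicted). The function names, the weights `w_hom`, `w_func`, `w_topo`, the additive form of the gain, and the toy data are all hypothetical. The stopping threshold `eps` is fixed here for simplicity, whereas the abstract describes a dynamic ϵ.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def build_adjacency(edges: Dict[Edge, float]) -> Dict[str, Dict[str, float]]:
    """Turn an undirected weighted edge list into a symmetric adjacency map."""
    adj: Dict[str, Dict[str, float]] = {}
    for (a, b), weight in edges.items():
        adj.setdefault(a, {})[b] = weight
        adj.setdefault(b, {})[a] = weight
    return adj


def marginal_gain(cand: str, module: List[str], hom: Dict[str, float],
                  func: Dict[str, float], adj: Dict[str, Dict[str, float]],
                  w_hom: float, w_func: float, w_topo: float) -> float:
    """Assumed gain: weighted homology + coherence + normalized consensus.

    The topology term is the mean consensus weight between the candidate and
    all current module members, so weakly attached tuples score lower as the
    module grows.
    """
    links = adj.get(cand, {})
    topo = sum(links.get(m, 0.0) for m in module) / len(module)
    return w_hom * hom[cand] + w_func * func[cand] + w_topo * topo


def seed_and_extend(seed: str, hom: Dict[str, float], func: Dict[str, float],
                    edges: Dict[Edge, float], eps: float,
                    w_hom: float = 1.0, w_func: float = 1.0,
                    w_topo: float = 1.0) -> List[str]:
    """Greedily grow a module from `seed` until the best gain drops below eps."""
    adj = build_adjacency(edges)
    module = [seed]
    members = {seed}
    while True:
        # Frontier: tuples adjacent to the module that are not yet members.
        frontier = {n for m in module for n in adj.get(m, {})} - members
        if not frontier:
            break
        # Sorting makes tie-breaking deterministic.
        gains = {c: marginal_gain(c, module, hom, func, adj,
                                  w_hom, w_func, w_topo)
                 for c in sorted(frontier)}
        best = max(gains, key=gains.get)
        if gains[best] < eps:  # epsilon-stopping: extension no longer pays off
            break
        module.append(best)
        members.add(best)
    return module


# Hypothetical toy data: four tuples on a chain with decaying support.
hom = {"T1": 0.5, "T2": 0.4, "T3": 0.1, "T4": 0.05}
func = {"T1": 0.4, "T2": 0.4, "T3": 0.1, "T4": 0.05}
edges = {("T1", "T2"): 1.0, ("T2", "T3"): 0.33, ("T3", "T4"): 0.33}

strict_module = seed_and_extend("T1", hom, func, edges, eps=0.6)
relaxed_module = seed_and_extend("T1", hom, func, edges, eps=0.3)
print(strict_module, relaxed_module)
```

On this toy input, the stricter threshold stops after two tuples, while the lower threshold admits a third before the diluted gain falls below `eps`. This mirrors, in a very reduced form, the trade-off between fragmentation and functional dilution that the stopping condition is meant to control.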
Author summary
Identifying shared regulatory mechanisms across diverse species is essential for understanding how complex biological systems evolve and adapt. However, traditional computer algorithms struggle to align these biological networks because evolution frequently duplicates genes, breaking simple one-to-one comparisons and producing highly fragmented results. To overcome this limitation, we developed a relaxed cross-species network alignment algorithm. Instead of demanding perfectly identical network shapes, our approach dynamically balances genetic sequence similarity, network structure, and biological function. We demonstrated the performance of our algorithm using plant drought-stress networks as a case study. While strict methods only found tiny, disconnected network fragments, our algorithm uncovered a functionally coherent, interconnected regulatory module across three distinct species. We discovered that while upstream command genes remain strictly conserved, they regulate highly customized, species-specific execution pathways downstream. Ultimately, our framework provides a scalable, species-agnostic method to decode complex systems, allowing researchers to translate conserved biological logic across diverse genomes.