GeneSNAKE: a Python package for benchmarking and simulation of gene regulatory networks and perturbation-induced expression data
Abstract
Understanding how genes interact with and regulate each other is a key challenge in systems biology, and gene regulatory networks (GRNs) are one of the primary means of studying it. GRN inference, however, faces many challenges, so effective tools for evaluating inference methods are needed. Such evaluation requires data that correspond to a known GRN and that cover a range of conditions and experimental setups, which is only attainable through simulation. Most existing tools for GRN-based simulation are limited in either network or data properties and offer few or no options to modify them.
To address these limitations, we present GeneSNAKE, a Python package that lets users generate biologically realistic GRNs and expression data for benchmarking. GeneSNAKE gives the user control over a wide range of network and data properties, including several distinct noise models. It improves on previous work in the field by adding a perturbation model and a wide range of perturbation schemes, along with control over the noise and the perturbation strength.
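The general idea of perturbation-based simulation with controllable noise can be illustrated with a minimal sketch. It does not use GeneSNAKE's API and is not GeneSNAKE's model; it assumes a hypothetical linear system dx/dt = A x + p, where A is a signed GRN matrix with self-degradation on the diagonal and p is a perturbation vector, and it adds Gaussian noise to the steady state. All names (`simulate_steady_state`, `knockdown_experiments`, `strength`, `noise_sd`) are illustrative assumptions.

```python
import random


def simulate_steady_state(A, p, noise_sd=0.0, dt=0.01, steps=5000, seed=0):
    """Euler-integrate dx/dt = A x + p towards steady state, then add Gaussian noise.

    Assumed toy model, not GeneSNAKE's: A[i][j] is the effect of gene j on gene i.
    """
    rng = random.Random(seed)
    n = len(A)
    x = [0.0] * n
    for _ in range(steps):
        # Compute all derivatives from the current state before updating it.
        dx = [sum(A[i][j] * x[j] for j in range(n)) + p[i] for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
    # Measurement noise is added once, to the final expression values.
    return [xi + rng.gauss(0.0, noise_sd) for xi in x]


def knockdown_experiments(A, strength=1.0, noise_sd=0.0):
    """Run one single-gene knockdown per gene; returns one expression profile per experiment."""
    n = len(A)
    data = []
    for g in range(n):
        p = [0.0] * n
        p[g] = -strength  # perturb only gene g
        data.append(simulate_steady_state(A, p, noise_sd=noise_sd, seed=g))
    return data


# Toy 3-gene network: gene 0 activates gene 1, gene 1 represses gene 2.
# The -1.0 diagonal is self-degradation, which keeps the system stable.
A = [
    [-1.0, 0.0, 0.0],
    [0.8, -1.0, 0.0],
    [0.0, -0.5, -1.0],
]

expression = knockdown_experiments(A, strength=1.0, noise_sd=0.05)
```

In this sketch, `strength` scales the perturbation and `noise_sd` sets the noise level, so the signal-to-noise ratio of the simulated data can be varied independently of the network.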
For benchmarking, GeneSNAKE offers functions for comparing network similarity and for comparing properties of data and GRNs. These functions can also be used to study the properties of biological data, so that simulated data can be produced with more realistic properties. GeneSNAKE is an open-source, comprehensive simulation and benchmarking package with powerful capabilities that are not combined in any other single package, and because it is implemented in Python, users can extend and modify it.
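One common way to compare an inferred network against a known one is edge-level precision, recall and F1. The sketch below shows that generic calculation on adjacency matrices; it is an illustration under assumed names (`edge_set`, `compare_networks`) and is not taken from GeneSNAKE's benchmarking functions.

```python
def edge_set(adj, threshold=0.0):
    """Return the set of (regulator row, target column) index pairs whose weight exceeds threshold in magnitude."""
    return {
        (i, j)
        for i, row in enumerate(adj)
        for j, weight in enumerate(row)
        if abs(weight) > threshold
    }


def compare_networks(true_adj, inferred_adj, threshold=0.0):
    """Edge-level precision, recall and F1 of an inferred network against the true one."""
    true_edges = edge_set(true_adj, threshold)
    pred_edges = edge_set(inferred_adj, threshold)
    tp = len(true_edges & pred_edges)  # edges present in both networks
    precision = tp / len(pred_edges) if pred_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical 3-gene example: two of the three true edges are recovered,
# and one inferred edge is a false positive.
true_net = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
inferred_net = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]

scores = compare_networks(true_net, inferred_net)
```

This comparison ignores edge sign and weight; a signed or weighted similarity measure would need a different definition of a matching edge.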