Fast Phenotype Simulation for Genotype Representation Graphs
Abstract
Motivation
The Genotype Representation Graph (GRG) [DeHaas et al., 2025] is a graph representation of whole-genome polymorphisms, designed to encode the variant hard-call information in phased whole genomes. It encodes genotypes as an extremely compact graph that can be traversed efficiently, enabling dynamic programming-style algorithms for applications such as genome-wide association studies that run faster on biobank-scale data than existing alternatives. To facilitate scalable statistical genetics, we present GrgPhenoSim, an extremely fast phenotype simulator for GRGs, suitable for simulating phenotypes on biobank-scale datasets.
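To illustrate why a graph traversal suits phenotype simulation, the following is a minimal sketch on a toy directed acyclic graph. It does not use the GrgPhenoSim or GRG library APIs. The graph layout, the node names, the effect sizes, and the helper functions `genetic_values` and `add_noise` are all hypothetical. The sketch assumes an additive model and assumes that each mutation-carrying node reaches a given sample by exactly one path, so a single top-down pass sums each effect once per sample.

```python
import numpy as np

def genetic_values(parents, node_effect, order):
    """Top-down pass: a node's value is its own effect plus its parents' values.

    `order` must list every parent before its children (topological order).
    """
    value = {}
    for node in order:
        inherited = sum(value[p] for p in parents.get(node, []))
        value[node] = node_effect.get(node, 0.0) + inherited
    return value

def add_noise(genetic, h2, rng):
    """Add Gaussian noise sized so that Var(genetic) / Var(phenotype) is about h2."""
    var_g = np.var(genetic)
    var_e = var_g * (1.0 - h2) / h2
    return genetic + rng.normal(0.0, np.sqrt(var_e), size=genetic.shape)

# Hypothetical toy graph: "a", "b", and "c" carry mutations with the given
# effect sizes; "s0".."s3" are haploid sample nodes (leaves).
parents = {"c": ["a"], "s0": ["c"], "s1": ["c", "b"], "s2": ["b"], "s3": []}
node_effect = {"a": 0.5, "b": -0.2, "c": 0.1}
order = ["a", "b", "c", "s0", "s1", "s2", "s3"]
samples = ["s0", "s1", "s2", "s3"]

values = genetic_values(parents, node_effect, order)
genetic = np.array([values[s] for s in samples])

rng = np.random.default_rng(0)  # fixed seed for reproducibility
phenotype = add_noise(genetic, h2=0.5, rng=rng)
```

Each node and edge is visited once, so the cost scales with the size of the graph rather than with the number of samples times the number of variants, which is the intuition behind traversal-based algorithms on compact genotype graphs.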
Results
GrgPhenoSim provides the primary functionality of a phenotype simulator, uses a standardized output format, and supports customized simulations. GrgPhenoSim is dozens to hundreds of times faster than tstrait [Tagami et al., 2024], a fast ancestral recombination graph-based phenotype simulator, for sample sizes ranging from thousands to hundreds of thousands of samples.
Availability
The GrgPhenoSim library and use-case demonstrations are available at https://github.com/aprilweilab/grg_pheno_sim
The documentation for GrgPhenoSim is hosted at https://grgl.readthedocs.io/en/latest/index.html