GraphMana: graph-native data management for population genomics projects

Abstract

Population genomics projects rely on fragmented file-based workflows that lose provenance and require full reprocessing when samples are added. GraphMana stores variant data in a graph database as packed genotype arrays with pre-computed population statistics, enabling incremental sample addition, provenance tracking, cohort management, and export to 17 formats. Two access paths serve different needs: a FAST PATH reading population-level arrays in O(K) time and a FULL PATH unpacking per-sample genotypes in O(N) time. On human 1000 Genomes data (3,202 samples, 70.7M variants), GraphMana completed a 46-operation lifecycle in 98 minutes from a single persistent database.
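The difference between the two access paths can be illustrated with a minimal Python sketch. It is not GraphMana's actual schema or API: the 2-bit genotype encoding, the VariantNode class, and every function name below are hypothetical, and the sketch assumes that K is the number of populations and N the number of samples, which the abstract does not state explicitly.

```python
from dataclasses import dataclass

# Assumed 2-bit codes (hypothetical, not taken from the preprint):
# 0 = hom-ref, 1 = het, 2 = hom-alt, 3 = missing.
MISSING = 3


def pack_genotypes(codes):
    """Pack 2-bit genotype codes, four samples per byte."""
    packed = bytearray((len(codes) + 3) // 4)
    for i, code in enumerate(codes):
        packed[i // 4] |= (code & 0b11) << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, n_samples):
    """Recover the per-sample codes; touches all N samples."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n_samples)]


@dataclass
class VariantNode:
    """Hypothetical variant record: packed genotypes plus per-population counts."""
    packed: bytes
    n_samples: int
    alt_counts: list      # alternate-allele count per population (length K)
    allele_numbers: list  # non-missing allele count per population (length K)


def build_variant(codes, sample_pops, n_pops):
    """Pack genotypes and pre-compute the population-level arrays once."""
    alt, an = [0] * n_pops, [0] * n_pops
    for code, pop in zip(codes, sample_pops):
        if code != MISSING:
            alt[pop] += code
            an[pop] += 2
    return VariantNode(pack_genotypes(codes), len(codes), alt, an)


def fast_path_frequencies(variant):
    """Read only the pre-computed arrays: work proportional to K."""
    return [ac / an if an else None
            for ac, an in zip(variant.alt_counts, variant.allele_numbers)]


def full_path_frequencies(variant, sample_pops, n_pops):
    """Unpack every genotype and recount: work proportional to N."""
    alt, an = [0] * n_pops, [0] * n_pops
    codes = unpack_genotypes(variant.packed, variant.n_samples)
    for code, pop in zip(codes, sample_pops):
        if code != MISSING:
            alt[pop] += code
            an[pop] += 2
    return [ac / n if n else None for ac, n in zip(alt, an)]


def add_sample(variant, code, pop):
    """Append one sample in place and update counts without a full recount."""
    packed = bytearray(variant.packed)
    if variant.n_samples % 4 == 0:
        packed.append(0)  # start a new byte when the last one is full
    packed[variant.n_samples // 4] |= (code & 0b11) << (2 * (variant.n_samples % 4))
    variant.packed = bytes(packed)
    variant.n_samples += 1
    if code != MISSING:
        variant.alt_counts[pop] += code
        variant.allele_numbers[pop] += 2
```

In this sketch, adding a sample only appends two bits and adjusts two counters for that variant, which is one plausible way that incremental addition could avoid reprocessing existing samples. Both paths return the same frequencies; the fast path simply never touches the packed array.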
