cuBayes: GPU accelerated FreeBayes that achieves 1-minute whole-genome SNV calling while maintaining algorithmic semantics

Anders Pitman
Cathy Yang
Yi Qiao

Abstract

Next-generation sequencing now produces whole-genome data in hours, but downstream variant calling remains a multi-hour to multi-day bottleneck that excludes genomic analysis from time-critical clinical settings. GPU acceleration offers a natural path forward, since variant calling is inherently parallelizable across genomic positions, yet open-source infrastructure for porting existing algorithms to GPU hardware remains limited, leaving many widely used tools without accelerated implementations. FreeBayes, a haplotype-based variant caller central to the 1000 Genomes Project and to multi-sample tumor evolution analyses, exemplifies this gap: it is natively single-threaded despite its algorithmic suitability for parallelization. We present cuBayes, a CUDA implementation of FreeBayes germline SNV calling that completes HG002 and HG004 2×250 bp Illumina 60× whole-genome analysis in one minute on a single NVIDIA RTX 6000 Ada GPU (as opposed to hours, if not days, with manual region-based CPU parallelization), while producing variant calls with 99.97% concordance to the CPU reference. cuBayes is structured around an atom/molecule architecture in which reusable functional units (BAM decompression, position-wise pileup, batch coordination) are cleanly separated from algorithm-specific logic, providing a foundation intended to support acceleration of additional sequence analysis algorithms without redundant low-level engineering.
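
As a rough illustration of why position-wise pileup parallelizes well, the sketch below counts observed bases at each reference position independently of every other position. It is a toy CPU example in Python and not cuBayes code: the read representation (0-based start plus an ungapped sequence) and the names `READS`, `pileup_at`, and `pileup_region` are assumptions made for illustration, and real pileup logic must also handle CIGAR operations, base qualities, and mapping qualities.

```python
from collections import Counter

# Hypothetical toy reads: (0-based start position, ungapped read sequence).
READS = [
    (0, "ACGT"),
    (2, "GTTA"),
    (3, "TTAC"),
]


def pileup_at(position, reads):
    """Count the bases that ungapped reads place at one reference position."""
    counts = Counter()
    for start, seq in reads:
        offset = position - start
        if 0 <= offset < len(seq):  # the read overlaps this position
            counts[seq[offset]] += 1
    return counts


def pileup_region(start, end, reads):
    """Build a pileup for each position in [start, end).

    Each position is computed without reference to any other position,
    which is the property that lets the work be split across many workers.
    """
    return {pos: pileup_at(pos, reads) for pos in range(start, end)}


pileups = pileup_region(0, 7, READS)
```

Because `pileup_at` reads only shared input and writes only its own result, the loop in `pileup_region` could in principle be distributed over any number of workers, one per position or per block of positions.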

Version published to 10.64898/2026.06.12.731910 on bioRxiv
Jun 16, 2026

