Benchmarking sequence performance on the DNBSEQ-T7 using Genome in a Bottle reference genomes

Ansia van Coller
Setshaba Taukobong
Maano Malima
Samira Ghoor
Nganea Nangammbi
Enrico Roode
Martin Naicker
Victoria Cole
Brigitte Glanzmann
Craig Kinnear
Nadia Carstens


Abstract

Advances in sequencing technologies have improved the accuracy, throughput, and completeness of human genome characterization, enabling more reliable detection of genetic variation. Well-characterized reference genomes are critical for benchmarking sequencing platforms and bioinformatics analysis pipelines. Here, we present whole genome sequencing datasets generated for the Ashkenazi Jewish trio reference samples from the Genome in a Bottle (GIAB) Consortium. Libraries were prepared using three distinct MGI-based workflows: PCR-free library preparation, FastFS DNA library preparation, and Universal DNA library preparation. Sequencing was performed on the MGI DNBSEQ-T7 platform, generating a minimum of 400 million paired-end reads per sample, corresponding to 30X mean genome coverage.
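
The link between read count and depth can be checked with simple arithmetic: raw mean coverage is roughly the total number of sequenced bases divided by the genome size. The sketch below is illustrative only. The read length (150 bp) and genome size (3.1 Gb) are assumptions, not values reported in the abstract, and the function names are hypothetical. The abstract also does not say whether the 400 million figure counts read pairs or individual reads, so both interpretations are shown.

```python
import math

# Assumed values for illustration; neither is stated in the abstract.
ASSUMED_READ_LENGTH_BP = 150
ASSUMED_GENOME_SIZE_BP = 3.1e9


def mean_coverage(n_reads: int,
                  read_length_bp: int = ASSUMED_READ_LENGTH_BP,
                  genome_size_bp: float = ASSUMED_GENOME_SIZE_BP) -> float:
    """Raw mean depth: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_for_coverage(target_coverage: float,
                       read_length_bp: int = ASSUMED_READ_LENGTH_BP,
                       genome_size_bp: float = ASSUMED_GENOME_SIZE_BP) -> int:
    """Number of individual reads needed to reach a target raw mean depth."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    # Interpretation 1: 400 million individual reads.
    print(round(mean_coverage(400_000_000), 1))      # 19.4
    # Interpretation 2: 400 million read pairs, i.e. 800 million reads.
    print(round(mean_coverage(2 * 400_000_000), 1))  # 38.7
    # Individual reads needed for 30X raw depth under the same assumptions.
    print(reads_for_coverage(30))                    # 620000000
```

Raw yield computed this way is an upper bound on realized depth: duplicate, filtered, and unmapped reads all reduce the mean coverage measured after alignment.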

Raw reads were processed using a standardized GATK bioinformatics workflow. Sequencing performance and variant detection accuracy were evaluated using the Genome in a Bottle high-confidence benchmark variant sets. All workflows demonstrated high sequencing quality and concordance with GIAB benchmark truth sets, with PCR-free libraries showing the strongest indel calling performance and lowest Mendelian violation rates across the Ashkenazi trio.
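
Two of the metrics named above, concordance with a truth set and the Mendelian violation rate in a trio, have simple definitions that a short sketch can make concrete. The code below is not the authors' tooling. It assumes biallelic or multiallelic diploid autosomal genotypes with no missing calls, represents each genotype as a pair of allele indices, and counts any genotype the parents cannot explain (including true de novo mutations) as a violation. All function names are hypothetical.

```python
from typing import Iterable, Tuple

Genotype = Tuple[int, int]  # diploid genotype as allele indices, e.g. (0, 1)


def is_mendelian_consistent(child: Genotype,
                            mother: Genotype,
                            father: Genotype) -> bool:
    """True if one child allele can come from each parent."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def mendelian_violation_rate(
        trios: Iterable[Tuple[Genotype, Genotype, Genotype]]) -> float:
    """Fraction of sites where the child genotype violates inheritance.

    Each item is (child, mother, father) for one site.
    """
    sites = list(trios)
    if not sites:
        raise ValueError("no sites supplied")
    violations = sum(
        not is_mendelian_consistent(c, m, f) for c, m, f in sites
    )
    return violations / len(sites)


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Concordance metrics from true positive, false positive, and
    false negative counts against a benchmark truth set."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    sites = [
        ((0, 1), (0, 0), (1, 1)),  # consistent: 0 from mother, 1 from father
        ((1, 1), (0, 0), (0, 1)),  # violation: mother carries no 1 allele
    ]
    print(mendelian_violation_rate(sites))  # 0.5
    print(precision_recall_f1(tp=90, fp=10, fn=30))
```

In practice these counts would come from comparing called variants to the benchmark set within high-confidence regions; the sketch only shows how the summary numbers are derived once the counts exist.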

This dataset provides a resource for benchmarking DNBSEQ-T7 sequencing and bioinformatics workflows, and for evaluating the impact of library preparation strategies on whole genome variant detection performance.

Version published to 10.64898/2026.05.22.727100 on bioRxiv
May 26, 2026

Integrated optimization of experimental and computational workflows improves genome recovery in long-read gut metagenomics

This article has 16 authors:
1. Yongjie Hu
2. Liuyong Sun
3. Ye Huang
4. Fangfang Jiang
5. Xin Tong
6. Juan Yang
7. Yanmei Ju
8. Zejun Yang
9. Shu Liufu
10. Yangzi Hu
11. Wenbing Ma
12. Ruijin Guo
13. Wangsheng Li
14. Tao Zhang
15. Xiaolong Zhu
16. Zhe Zhang
This article has no evaluations
Latest version May 26, 2026

Systematic benchmarking of small variant calling pipelines for long-read RNA sequencing data

This article has 2 authors:
1. Jiayi Wang
2. Mark D. Robinson
This article has no evaluations
Latest version May 2, 2026

Benchmarking full-length ITS metabarcoding across Illumina 2x500, PacBio, and Oxford Nanopore sequencing using mock and soil communities

This article has 7 authors:
1. Leho Tedersoo
2. Marko Prous
3. Meirong Chen
4. Sten Anslan
5. Irja Saar
6. Benjamin Dubois
7. Vladimir Mikryukov
This article has no evaluations
Latest version May 21, 2026
