FASTQuick: rapid and comprehensive quality assessment of raw sequence reads

Abstract

Background

Rapid and thorough quality assessment of sequenced genomes on an ultra-high-throughput scale is crucial for successful large-scale genomic studies. Comprehensive quality assessment typically requires full genome alignment, which consumes substantial computational resources and turnaround time. Existing tools are either computationally expensive because they perform full alignment or lack essential quality metrics because they skip read alignment.

Findings

We developed a set of rapid and accurate methods to produce comprehensive quality metrics directly from a subset of raw sequence reads (from whole-genome or whole-exome sequencing) without full alignment. Our methods offer orders of magnitude faster turnaround time than existing full alignment–based methods while providing comprehensive and sophisticated quality metrics, including estimates of genetic ancestry and cross-sample contamination.

Conclusions

By performing quality assessment rapidly and comprehensively, our tool will help investigators detect potential issues in ultra-high-throughput sequence reads in real time and at low computational cost during the early stages of analysis, ensuring high-quality downstream results and preventing unexpected losses of time, money, and invaluable specimens.

Article activity feed

  1. Now published in GigaScience doi: 10.1093/gigascience/giab004

    Fan Zhang (1; for correspondence: fanzhang@umich.edu), Hyun Min Kang (2)

    (1) Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
    (2) Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA

    A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giab004), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

    These peer reviews were as follows:

    Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102627
    Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102628