Indexcov: fast coverage quality control for whole-genome sequencing

Abstract

The BAM [1] and CRAM [2] formats provide a supplementary linear index that facilitates rapid access to sequence alignments in arbitrary genomic regions. Comparing consecutive entries in a BAM or CRAM index allows one to infer the number of alignment records per genomic region, which serves as an effective proxy for sequence depth in that region. Based on these properties, we have developed indexcov, an efficient estimator of whole-genome sequencing coverage that rapidly identifies samples with aberrant coverage profiles, reveals large-scale chromosomal anomalies, recognizes potential batch effects, and infers the sex of a sample. Indexcov is available at https://github.com/brentp/goleft under the MIT license.
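The idea of comparing consecutive index entries can be illustrated with a minimal sketch. It is not indexcov's actual implementation (indexcov is part of the Go project goleft), and it covers only the BAM case. It assumes that the linear-index entries for one chromosome have already been read into a list of virtual offsets, one per 16,384-base tile, and that the compressed-file offset sits in the upper 48 bits of each virtual offset, as described in the BAM index specification. The function names and the example offsets are hypothetical.

```python
from statistics import median

TILE_SIZE = 16_384  # bases spanned by one BAM linear-index entry (2**14)


def compressed_offset(virtual_offset: int) -> int:
    """Return the compressed-file byte offset held in the upper 48 bits."""
    return virtual_offset >> 16


def scaled_depth(virtual_offsets: list[int]) -> list[float]:
    """Estimate relative depth per tile from consecutive index entries.

    The number of bytes between two consecutive entries is used as a proxy
    for the amount of alignment data in that tile, then divided by the
    median of the non-empty tiles so that typical coverage is about 1.0.
    """
    offsets = [compressed_offset(v) for v in virtual_offsets]
    sizes = [b - a for a, b in zip(offsets, offsets[1:])]
    nonzero = [s for s in sizes if s > 0]
    if not nonzero:
        return [0.0] * len(sizes)
    med = median(nonzero)
    return [s / med for s in sizes]


# Hypothetical virtual offsets for six consecutive tiles (made-up data).
example = [c << 16 for c in (0, 1000, 2000, 3000, 3500, 5500)]
depths = scaled_depth(example)
print(depths)
```

In this made-up example, the fourth tile holds half the typical amount of data and the fifth holds twice as much, which is the kind of signal that could point to a deletion or duplication when it persists over many adjacent tiles.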

Article activity feed

  1. Now published in GigaScience doi: 10.1093/gigascience/gix090

    Brent S. Pedersen (1, 3); Ryan L. Collins (4, 6, 7); Michael E. Talkowski (4, 5, 6, 7); Aaron R. Quinlan (1, 2, 3)

    Affiliations: (1) Department of Human Genetics, University of Utah, Salt Lake City, UT; (2) Department of Biomedical Informatics, University of Utah, Salt Lake City, UT; (3) USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT; (4) Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA; (5) Department of Neurology, Harvard Medical School, Boston, MA; (6) Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA; (7) Program in Bioinformatics and Integrative Genomics, Division of Medical Sciences, Harvard Medical School, Boston, MA

    A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix090), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

    These peer reviews were as follows:

    Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100844

    Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100845