Identifying, understanding, and correcting technical artifacts on the sex chromosomes in next-generation sequencing data
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (GigaScience)
Abstract
Background
Mammalian X and Y chromosomes share a common evolutionary origin and retain regions of high sequence similarity. Similar sequence content can confound the mapping of short next-generation sequencing reads to a reference genome. It is therefore possible that the presence of both sex chromosomes in a reference genome can cause technical artifacts in genomic data and affect downstream analyses and applications. Understanding this problem is critical for medical genomics and population genomic inference.
Results
Here, we characterize how sequence homology can affect analyses on the sex chromosomes and present XYalign, a new tool that (1) facilitates the inference of sex chromosome complement from next-generation sequencing data; (2) corrects erroneous read mapping on the sex chromosomes; and (3) tabulates and visualizes important metrics for quality control such as mapping quality, sequencing depth, and allele balance. We find that sequence homology affects read mapping on the sex chromosomes and this has downstream effects on variant calling. However, we show that XYalign can correct mismapping, resulting in more accurate variant calling. We also show how metrics output by XYalign can be used to identify XX and XY individuals across diverse sequencing experiments, including low- and high-coverage whole-genome sequencing, and exome sequencing. Finally, we discuss how the flexibility of the XYalign framework can be leveraged for other uses including the identification of aneuploidy on the autosomes. XYalign is available open source under the GNU General Public License (version 3).
Conclusions
Sex chromosome sequence homology causes the mismapping of short reads, which in turn affects downstream analyses. XYalign provides a reproducible framework to correct mismapping and improve variant calling on the sex chromosomes.
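As a rough illustration of how sequencing depth can inform the inference of sex chromosome complement described in the abstract, the sketch below compares median depth on X and Y against an autosome. It is not XYalign's implementation: the function name `infer_sex_complement`, the chromosome labels, and both thresholds are hypothetical choices made for this example, and real data would usually also need mapping-quality filtering because of the X-Y homology discussed above.

```python
from statistics import median


def infer_sex_complement(depth_by_chrom, autosome="chr19", x="chrX", y="chrY",
                         y_present_ratio=0.1, x_single_copy_ratio=0.75):
    """Guess XX or XY from per-window depth values (illustrative only).

    depth_by_chrom maps a chromosome name to a list of window depths.
    Both thresholds are assumptions for this sketch, not published values.
    """
    autosome_depth = median(depth_by_chrom[autosome])
    if autosome_depth <= 0:
        raise ValueError("autosomal depth must be positive to normalize")

    # Normalize sex chromosome depth by the autosomal (diploid) baseline.
    x_ratio = median(depth_by_chrom[x]) / autosome_depth
    y_ratio = median(depth_by_chrom[y]) / autosome_depth

    # Expectation: XX has X near 1.0 and Y near 0; XY has X near 0.5 and Y near 0.5.
    if y_ratio >= y_present_ratio and x_ratio < x_single_copy_ratio:
        call = "XY"
    elif y_ratio < y_present_ratio and x_ratio >= x_single_copy_ratio:
        call = "XX"
    else:
        call = "ambiguous"  # e.g. mismapped reads, low coverage, or aneuploidy
    return call, x_ratio, y_ratio


# Toy depth values invented for the example (not real data).
xx_sample = {"chr19": [30, 31, 29, 30], "chrX": [30, 29, 31, 30], "chrY": [0, 1, 0, 0]}
xy_sample = {"chr19": [30, 31, 29, 30], "chrX": [15, 14, 16, 15], "chrY": [14, 15, 13, 15]}

print(infer_sex_complement(xx_sample)[0])
print(infer_sex_complement(xy_sample)[0])
```

Samples that fall outside both expected patterns are reported as ambiguous rather than forced into a call, since mismapped reads or an unusual chromosome complement can produce intermediate ratios.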
Article activity feed
Now published in GigaScience doi: 10.1093/gigascience/giz074
Authors
- Timothy H. Webster (1)
- Madeline Couse (2)
- Bruno M. Grande (3)
- Eric Karlins (4)
- Tanya N. Phung (5)
- Phillip A. Richmond (6, 7)
- Whitney Whitford (8)
- Melissa A. Wilson Sayres (1, 9)
Affiliations
- (1) School of Life Sciences, Arizona State University
- (2) Child and Family Research Institute, University of British Columbia
- (3) Department of Molecular Biology and Biochemistry, Simon Fraser University
- (4) Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health
- (5) Interdepartmental Program in Bioinformatics, UCLA
- (6) Centre for Molecular Medicine and Therapeutics, University of British Columbia
- (7) BC Children’s Hospital
- (8) School of Biological Sciences, The University of Auckland
- (9) Center for Evolution and Medicine, Arizona State University
A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz074), where the paper and peer reviews are published openly under a CC-BY 4.0 license.
These peer reviews were as follows:
- Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101812
- Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101813