Y-mer: A k-mer based method for determining human Y chromosome haplogroups from ultra-low sequencing depth data
Abstract
Determining the genetic ancestry of an individual is challenging when samples are poorly preserved or mixed and permit only ultra-low-coverage whole-genome sequencing (ulcWGS), with depth < 0.1x at target loci. Leveraging recent advances in telomere-to-telomere sequencing of whole genomes with long reads, we first show in a simplified example how the copy numbers of short DNA strings (k-mers) at two different types of repeat arrays correlate with basal chromosome Y (chrY) haplogroups (HGs). We then develop a new k-mer based method, Y-mer, and show how information from hundreds of thousands of k-mers in distance-based models enables accurate inference of the chrY haplogroup from WGS data at depths below 0.01x without additional PCR or capture. We test the performance of Y-mer on ancient DNA and prenatal screening data, showing its potential for genetic ancestry inference in cell-free, forensic and ancient DNA research from short-read WGS data.
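To make the idea of a distance-based k-mer model concrete, the following is a minimal toy sketch and not the Y-mer implementation. It assumes a small, hypothetical set of informative k-mers and made-up reference profiles for two placeholder haplogroups (`HG_A`, `HG_B`). It counts the informative k-mers in a handful of short reads, normalizes the counts to relative frequencies, and assigns the sample to the reference profile at the smallest Euclidean distance. The choice of k, the k-mer set, the distance measure, and all function names are illustrative assumptions; the abstract does not specify them.

```python
from collections import Counter
import math

def kmer_counts(reads, k, informative):
    """Count occurrences of the informative k-mers across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in informative:
                counts[kmer] += 1
    return counts

def normalize(counts, informative):
    """Turn raw counts into relative frequencies over the informative set."""
    total = sum(counts.get(km, 0) for km in informative)
    if total == 0:
        # No informative k-mer observed: return an all-zero profile.
        return {km: 0.0 for km in informative}
    return {km: counts.get(km, 0) / total for km in informative}

def euclidean(p, q):
    """Euclidean distance between two profiles sharing the same keys."""
    return math.sqrt(sum((p[km] - q[km]) ** 2 for km in p))

def nearest_haplogroup(profile, references):
    """Return the label of the reference profile closest to the sample."""
    return min(references, key=lambda hg: euclidean(profile, references[hg]))

# Hypothetical toy inputs (assumptions, not data from the study).
K = 4
INFORMATIVE = {"ACGT", "TTGA", "GGCC"}
REFERENCES = {
    "HG_A": {"ACGT": 0.7, "TTGA": 0.2, "GGCC": 0.1},
    "HG_B": {"ACGT": 0.1, "TTGA": 0.3, "GGCC": 0.6},
}
reads = ["ACGTACGT", "TTGAC", "ACGTT"]

sample_counts = kmer_counts(reads, K, INFORMATIVE)
sample_profile = normalize(sample_counts, INFORMATIVE)
best = nearest_haplogroup(sample_profile, REFERENCES)
print(best)
```

In this toy case the sample profile is dominated by one k-mer and therefore lies closest to `HG_A`. A realistic application would differ in scale and detail, since the method described above draws on hundreds of thousands of k-mers.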