A Novel Efficient Algorithm for Common Variants Genotyping from Low-Coverage Sequencing Data

Abstract

Low-coverage whole-genome sequencing (LC-WGS) combined with imputation is a cost-effective genotyping strategy for genome-wide association studies (GWAS) in population genetics. In this study, we developed the Limpute algorithm specifically for genotyping from low-coverage sequencing data. It extracts variant information from the sequencing data using novel virtual probes and then performs imputation by cross-referencing between samples. Compared with GLIMPSE2, the currently dominant algorithm for low-coverage sequencing data, Limpute achieved similar imputation performance for common variants (r² > 0.87), while GLIMPSE2 had a runtime approximately five times longer than that of Limpute. To evaluate the accuracy of genotype imputation by Limpute more fully, we used high-coverage whole-genome sequencing data (30x), microarray data, and high-coverage whole-exome sequencing data (30x) as separate validation sets. The results showed that Limpute imputes common variants well from low-coverage sequencing data (1x: r² > 0.87; 3x: r² > 0.92; 5x: r² > 0.93). In summary, we present a highly efficient, low-cost algorithm for genotyping from low-coverage sequencing data, offering substantial support for genetic research.
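
The abstract reports accuracy as r² but does not say how it was computed. A common convention in imputation benchmarks is the squared Pearson correlation between imputed dosages and genotypes from a validation set, and the minimal sketch below assumes that convention. The function name `squared_pearson_r` and the toy values are hypothetical, not taken from the article or from Limpute.

```python
def squared_pearson_r(truth, imputed):
    """Squared Pearson correlation between validation genotypes (0, 1, 2)
    and imputed dosages (0.0 to 2.0).

    Assumption: this is one common way to score imputation accuracy;
    the article may define r² differently.
    """
    if len(truth) != len(imputed) or len(truth) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(truth)
    mean_t = sum(truth) / n
    mean_i = sum(imputed) / n

    # Sums of cross-products and squared deviations (unnormalised).
    cov = sum((t - mean_t) * (i - mean_i) for t, i in zip(truth, imputed))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_i = sum((i - mean_i) ** 2 for i in imputed)

    # r² is undefined when either vector is constant (e.g. a monomorphic site).
    if var_t == 0 or var_i == 0:
        return float("nan")
    return (cov * cov) / (var_t * var_i)


# Hypothetical toy data: validation genotypes and imputed dosages.
truth = [0, 1, 2, 1, 0, 2]
imputed = [0.1, 0.9, 1.8, 1.2, 0.0, 1.9]
print(round(squared_pearson_r(truth, imputed), 3))
```

In practice such a score is usually computed per variant across samples, or pooled within allele-frequency bins, which is why constant vectors are returned as undefined rather than as zero.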
