HaploCharmer: a Snakemake workflow for read-scale haplotype calling adapted to polyploids

Abstract

The advent of next-generation sequencing (NGS) has revolutionized the study of single nucleotide polymorphisms (SNPs), making such studies increasingly cost-effective. Haplotypes, which combine alleles from adjacent variants, offer several advantages over bi-allelic SNPs, including enhanced information content, reduced dimensionality, and improved statistical power in genomic studies. These benefits are particularly significant for polyploid species, where distinguishing all homologous copies using SNP markers alone can be challenging. This article introduces HaploCharmer, a flexible workflow designed for read-scale haplotype calling from NGS data. HaploCharmer identifies haplotypes within preconfigured genomic regions smaller than a sequencing read, ensuring direct comparability across individuals. It integrates a series of processing steps: mapping, haplotype identification, filtration, and reporting of haplotype sequences as presence-absence calls across the panel of accessions analyzed. The performance of HaploCharmer was validated using whole-genome sequencing data from a highly polyploid sugarcane cultivar (R570) and its self-progeny. The workflow successfully identified a large number of high-quality haplotypes, with fewer than 1% false positives. Single-dose haplotypes were used to construct a genetic map that accurately captured known chromosomal rearrangements in the R570 cultivar, demonstrating the workflow's effectiveness in studying large chromosomal structural variations. HaploCharmer provides a robust method for diversity, genetic mapping, and quantitative genetics studies in both diploid and polyploid species.
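
To make the idea of read-scale haplotype calling concrete, the sketch below illustrates the general principle described in the abstract; it is not HaploCharmer's implementation. It assumes reads are already mapped and are represented as ungapped alignments, i.e. (0-based reference start, sequence) pairs. The function names `call_haplotypes` and `presence_absence` and the `min_reads` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def call_haplotypes(reads, region_start, region_end, min_reads=2):
    """Return the haplotype sequences observed in [region_start, region_end).

    reads: iterable of (ref_start, sequence) pairs, assumed to be ungapped
    alignments with 0-based reference coordinates (a simplifying assumption).
    Only reads that span the whole region contribute, so each haplotype is
    read directly from a single read rather than inferred by phasing.
    """
    counts = Counter()
    for ref_start, seq in reads:
        ref_end = ref_start + len(seq)
        if ref_start <= region_start and ref_end >= region_end:
            haplotype = seq[region_start - ref_start:region_end - ref_start]
            counts[haplotype] += 1
    # Hypothetical filter: keep haplotypes supported by at least min_reads reads.
    return {hap for hap, n in counts.items() if n >= min_reads}


def presence_absence(calls_by_accession):
    """Build a haplotype x accession matrix with 1 (present) or 0 (absent)."""
    haplotypes = sorted(set().union(*calls_by_accession.values()))
    return {
        hap: {acc: int(hap in calls) for acc, calls in calls_by_accession.items()}
        for hap in haplotypes
    }


# Toy data: one region covering reference positions 2..5 (end-exclusive 6).
reads_a = [(0, "ACGTACGT"), (1, "CGTACG"), (0, "ACGAACGT"), (3, "TACG")]
reads_b = [(0, "ACGAACGT"), (0, "ACGAACGT"), (2, "GTACGT"), (0, "ACGTACGT")]

calls = {
    "A": call_haplotypes(reads_a, 2, 6),
    "B": call_haplotypes(reads_b, 2, 6),
}
matrix = presence_absence(calls)
for hap, row in matrix.items():
    print(hap, row)
```

Because every accession is scored on the same fixed region, the resulting haplotype sequences can be compared directly across individuals. In this toy example the read carrying `GAAC` in accession A is discarded for lack of support, which is the kind of filtering step a real workflow would tune to sequencing depth and ploidy.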
