KCFtools: Rapid alignment-free method for introgression screening and GWAS using k-mer profiles
Abstract
Motivation
In the era of multiple genome references, researchers often align sequencing reads against distinct assemblies or even multiple references simultaneously. This enables applications such as the detection of introgressed segments or highly variable genomic regions, which are especially prevalent in large-genome crop species such as lettuce or wheat. However, these applications come at the cost of increased computational burden, inconsistencies in mapping methods, and reduced reproducibility across studies. To address these limitations, we developed KCFtools, a Java-based toolkit that identifies the presence and absence of k-mers in non-overlapping genomic or transcriptomic windows by comparing query and reference genomes. This alignment-free approach enables the efficient computation of an identity score for each window, thereby facilitating robust detection of introgressed or variable regions across genomes.
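The windowed presence/absence idea can be illustrated with a minimal Python sketch. This is not the KCFtools implementation: the function names, the forward-strand-only k-mer extraction, and the definition of the score as the plain fraction of reference k-mers observed in the query are all assumptions made for illustration.

```python
def kmer_set(seq, k):
    """Collect every k-mer in seq (forward strand only, a simplification)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def window_identity(reference, query_kmers, k, window):
    """Score non-overlapping reference windows by k-mer presence in the query.

    Returns a list of (start, end, observed, total, score) tuples, where
    score = observed / total is an assumed, simplified identity score.
    """
    reference = reference.upper()
    rows = []
    for start in range(0, len(reference), window):
        end = min(start + window, len(reference))
        # Only k-mers that lie entirely inside the window are counted.
        total = max(0, end - start - k + 1)
        observed = sum(
            1
            for i in range(start, start + total)
            if reference[i:i + k] in query_kmers
        )
        score = observed / total if total else 0.0
        rows.append((start, end, observed, total, score))
    return rows


# Toy data (hypothetical): the query differs from the reference by one
# substitution in the second 10 bp window.
reference = "ACGTACGTAC" + "TTTTGGGGCC"
query = "ACGTACGTAC" + "TTTTAGGGCC"

for row in window_identity(reference, kmer_set(query, 4), k=4, window=10):
    print(row)
```

In this toy case a single substitution removes every k-mer that overlaps it, so the affected window scores well below the unaffected one. This is why a per-window score can flag divergent or introgressed segments without any alignment.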
Results
We systematically evaluated the performance and accuracy of the k-mer-based method implemented in KCFtools, benchmarking it against conventional SNP-based introgression detection pipelines. Our results demonstrate that KCFtools effectively captures introgressed segments and structurally diverse regions, even in species with fragmented or highly divergent reference genomes. In addition, we extended KCFtools to generate genotype matrices from k-mer variation tables. These matrices are compatible with Genome-Wide Association Studies (GWAS) software and allow the identification of loci associated with phenotypic traits. We showcase the utility of this approach by detecting known and novel associations for downy mildew resistance in lettuce, underscoring the pipeline’s potential for high-resolution, reference-agnostic population genetic analysis.
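As a rough illustration of how per-window scores might be turned into a GWAS-style genotype matrix, the sketch below thresholds identity scores into 0/1/2 calls. The abstract does not describe the conversion KCFtools actually performs. The thresholds (`ref_min`, `alt_max`), the 0/1/2 encoding, and the function names are hypothetical.

```python
def call_genotype(score, ref_min=0.9, alt_max=0.5):
    """Map one window identity score to a genotype code.

    Hypothetical encoding: 0 = reference-like, 2 = divergent,
    1 = intermediate, None = missing score.
    """
    if score is None:
        return None
    if score >= ref_min:
        return 0
    if score <= alt_max:
        return 2
    return 1


def genotype_matrix(scores_by_window, ref_min=0.9, alt_max=0.5):
    """Convert {window: [score per sample]} into {window: [genotype per sample]}."""
    return {
        window: [call_genotype(s, ref_min, alt_max) for s in scores]
        for window, scores in scores_by_window.items()
    }


# Toy table (hypothetical): two windows scored in three samples.
scores = {
    "chr1:0-10": [1.0, 0.43, 0.7],
    "chr1:10-20": [0.95, None, 0.2],
}
matrix = genotype_matrix(scores)
```

Each row of `matrix` then plays the role of one marker across samples. Real GWAS software would additionally need the matrix written in a format it accepts.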
Availability
https://github.com/sivasubramanics/kcftools
Contact
c.s.sivasubramani@gmail.com