SpecImmune accurately genotypes diverse immune-related gene families using long-read data
Abstract
Polymorphic immune-related genes (HLA, KIR, IG, TCR, and CYP) are highly complex owing to their extensive heterozygosity and inter-locus homology, so accurate characterization requires dedicated methods. We present SpecImmune, the first comprehensive tool that uses long-read sequencing data to resolve the full spectrum of these immune-related genes. The method uses an iterative graph-based algorithm for haplotype reconstruction. We validated SpecImmune on 1,019 samples from the 1kGP ONT cohort, 42 PacBio CLR and 9 PacBio HiFi samples from the HGSVC project, and 47 PacBio HiFi and 37 ONT samples from the HPRC project. SpecImmune achieved an accuracy of 98% in HLA typing, a 12% improvement over both SpecHLA and HLA*LA. SpecImmune is the first method to type multiple CYP loci, and the first to allow precise KIR and germline IG/TCR typing using long reads. Comprehensive genotyping of these loci with SpecImmune reveals substantial, previously unreported linkage disequilibrium among HLA, KIR, and CYP loci. The proteins derived from these loci exhibit strong binding affinities, which suggests a possible origin for the marked linkage disequilibrium. SpecImmune also reveals elevated IG/TCR heterozygosity in African populations, a novel finding. Additionally, SpecImmune facilitates the detection of de novo mutations and enables allele-specific drug recommendations.