The human immunoglobulin heavy chain constant gene locus is enriched for large complex structural variants and coding polymorphisms that vary in frequency among human populations
Abstract
The immunoglobulin heavy chain constant (IGHC) domain of antibodies (Ab) is responsible for a variety of effector functions critical to Ab-mediated immunity. In humans, this domain is encoded by genes within the IGHC locus, in which descriptions of genomic diversity remain incomplete. To address this, we used long-read genomic datasets to build a high-quality IGHC haplotype/variant catalog from 105 individuals of diverse ancestry. This included the initial construction of highly vetted haplotype-resolved assemblies for 8 individuals, from which we tested and optimized a high-throughput approach for targeted long-read sequencing and assembly of the IGHC locus. This approach generated accurate, locally phased assemblies and genotype callsets and, applied at scale, facilitated discovery of novel single-nucleotide variants (SNVs) and complex structural variants (SVs), as well as previously uncharacterized genes and alleles. In total, we identified 262 coding alleles, 235 of which were undocumented. Comparisons of SNV, SV, and gene allele/genotype frequencies revealed significant population differentiation, highlighting potential sites of past natural selection and/or genetic drift. Together, our results illuminate missing signatures of haplotype diversity in the IGHC locus and establish a new foundation for cataloguing IGHC germline variation and addressing its role in Ab function and disease.