Genome-wide analysis of tandem repeat variation identifies SLC15A4 as a susceptibility gene for idiopathic pulmonary fibrosis
Abstract
Idiopathic pulmonary fibrosis (IPF) is a progressive and fatal interstitial lung disease with a strong genetic component, yet a substantial proportion of its genetic risk remains unexplained by single-nucleotide variant (SNV) association studies. In this study, we conduct a genome-wide association study of tandem repeats (TRs), a pervasive and understudied class of genetic variation, to identify novel loci influencing IPF susceptibility. Using whole-genome sequencing data from two case-control datasets (discovery: 507 cases and 174 controls; replication: 1243 cases and 3197 controls), we identify and replicate associations at four TR loci, most notably a complex repeat within intron 2 of SLC15A4, a gene encoding the endolysosomal peptide-histidine transporter 1 (PHT1). This TR lies within a predicted enhancer and acts as an expression and splicing quantitative trait locus (eQTL/sQTL), influencing the expression of an alternative SLC15A4 transcript in biologically relevant tissues and cells, with longer TR alleles correlating with decreased expression of this alternative transcript and increased IPF risk. In a biologically relevant cell-line model, we show that this alternative transcript is targeted by nonsense-mediated mRNA decay and that its levels are inversely correlated with levels of the canonical full-length SLC15A4 transcript. Our findings implicate SLC15A4 in IPF pathogenesis, possibly via modulation of innate immune responses to injury, and demonstrate the power of TR-focused genomic analyses to reveal previously undetectable disease mechanisms and therapeutic targets.