Causal splicing variants revealed by deep-learning integration of single-cell sQTL mapping under influenza infection
Abstract
Background

Fulfilling the promise of human genetics in elucidating disease requires identifying the causal variants and genes underlying genetic association signals. Molecular quantitative trait locus (molQTL) analyses, e.g. expression QTL (eQTL) and splicing QTL (sQTL), link genetic variants to intermediate molecular phenotypes, but pinpointing causal variants and their regulatory effects remains challenging. Here, we integrate sQTL analysis with deep-learning-based splicing effect annotation to identify causal genetic variants and elucidate the functional mechanisms by which they affect human phenotypes.

Results

Using a single-cell GWAS method (scHi-HOST) on 96 lymphoblastoid cell lines (LCLs) with and without influenza A virus (IAV) infection, we discovered ~43,000 sQTLs associated with 217 genes after IAV infection. Integrating sQTLs with AI splice prediction, we uncovered 76 likely causal variants that affect cis-acting molecular splicing components (5’ donor, 3’ acceptor), supported by further computational analysis. Among these, we experimentally validated a causal sQTL signal affecting poly(ADP-ribose) polymerase 2 (PARP2). The causal variant, rs2297616, alters the 5’ splice donor site in the second intron of PARP2, resulting in two protein isoforms that differ by 13 amino acids. The derived A allele was associated with the longer protein isoform and with increased IAV levels in LCLs. CRISPR editing validated the causal effect of this variant on both protein length and IAV infection. Lastly, these 76 putative causal sQTLs were further linked to over a hundred GWAS traits, including many variants associated with autoimmune diseases.

Conclusions

Our work provides a catalog of causal sQTLs with direct splicing impacts, offering mechanistic insights from genotype to disease susceptibility.
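The integration step described in the Results (keeping sQTL variants that a splice predictor also flags as altering a donor or acceptor site) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Variant` fields, the `likely_causal` function, both cutoffs, and the toy records are hypothetical and chosen only for illustration. The second helper shows the arithmetic behind a 13-amino-acid isoform difference, which corresponds to 39 nucleotides and therefore preserves the reading frame.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One sQTL variant with hypothetical splice-prediction scores."""
    variant_id: str        # placeholder ID, not a real rsID
    gene: str
    sqtl_pvalue: float     # association p-value from the sQTL scan
    donor_delta: float     # predicted change in 5' donor usage (assumed scale -1..1)
    acceptor_delta: float  # predicted change in 3' acceptor usage (assumed scale -1..1)


def likely_causal(variants, p_cutoff=5e-8, delta_cutoff=0.5):
    """Keep variants that pass both the sQTL and the splice-score cutoff.

    Both cutoffs are illustrative assumptions, not values from the study.
    """
    kept = []
    for v in variants:
        strongest = max(abs(v.donor_delta), abs(v.acceptor_delta))
        if v.sqtl_pvalue <= p_cutoff and strongest >= delta_cutoff:
            kept.append(v)
    return kept


def nucleotides_for_residues(n_residues):
    """Coding nucleotides needed to encode n_residues amino acids."""
    return 3 * n_residues


def preserves_frame(n_nucleotides):
    """An insertion or deletion keeps the reading frame if divisible by 3."""
    return n_nucleotides % 3 == 0


# Toy records (made up): only var_a passes both filters.
toy = [
    Variant("var_a", "GENE1", 1e-12, 0.8, 0.0),
    Variant("var_b", "GENE1", 1e-12, 0.1, 0.05),  # strong sQTL, weak splice score
    Variant("var_c", "GENE2", 1e-2, 0.9, 0.0),    # strong splice score, weak sQTL
]

if __name__ == "__main__":
    print([v.variant_id for v in likely_causal(toy)])
    print(nucleotides_for_residues(13), preserves_frame(nucleotides_for_residues(13)))
```

Requiring both lines of evidence is the design point: the association signal alone cannot separate a causal variant from neighbours in linkage disequilibrium, while the splice score alone does not show that the variant matters in the cells studied.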