On the analysis of genetic association with long-read sequencing data
Abstract
Long-read sequencing (LRS) technologies have enhanced the ability to resolve complex genomic architecture and determine the 'phase' relationships of genetic variants over long distances. Although genome-wide association studies (GWAS) identify individual variants associated with complex traits, they do not typically account for whether multiple associated signals at a locus may act in cis or trans, or whether they reflect allelic heterogeneity. As a result, effects that arise specifically from phase relationships may remain hidden in analyses using short-read and microarray data. While the advent of LRS has enabled accurate measurement of phase in population cohorts, statistical methods that leverage phase in genetic association analysis remain underdeveloped. Here, we introduce the Regression on Phase (RoP) method, which directly models cis and trans phase effects between variants under a regression framework. In simulations, RoP outperforms genotype interaction tests that detect phase effects indirectly, and distinguishes in-cis from in-trans phase effects. We implemented RoP at two cystic fibrosis (CF) modifier loci discovered by GWAS. At the chromosome 7q35 trypsinogen locus, RoP confirmed that two variants contributed independently (allelic heterogeneity). At the SLC6A14 locus on chromosome X, phase analysis uncovered a coordinated regulatory mechanism in which a promoter variant modulates lung phenotypes in individuals with CF when acting in cis with a lung-specific enhancer (E2765449/enhD); this mechanism was confirmed in functional studies. These findings highlight the potential of leveraging phase information from LRS in genetic association studies. Analyzing phase effects with RoP can provide deeper insights into the complex genetic architectures underlying disease phenotypes, ultimately guiding more informed functional investigations and potentially revealing new therapeutic targets.
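The abstract does not give the RoP model specification, so what follows is only an illustrative sketch and not the authors' implementation. It assumes two biallelic variants, A and B, with phased alleles on each haplotype. It encodes a "cis" count (haplotypes carrying both alternate alleles) and a "trans" count (alternate alleles on opposite haplotypes) as separate regressors alongside the two genotype dosages. The function name `phase_design`, this encoding, the allele frequencies, the effect size, and the simulated cis-only trait are all assumptions made for illustration. The sketch also shows why a genotype interaction test sees phase only indirectly: the product of the two dosages equals cis + trans, so it cannot tell the two apart.

```python
import numpy as np


def phase_design(h1, h2):
    """Build columns [intercept, gA, gB, cis, trans] from phased alleles.

    h1, h2: integer arrays of shape (n, 2). Column 0 holds the variant A
    allele and column 1 the variant B allele (0 = reference, 1 = alternate)
    on the first and second haplotype of each individual.
    This encoding is a hypothetical illustration, not the published model.
    """
    g_a = h1[:, 0] + h2[:, 0]  # dosage at variant A
    g_b = h1[:, 1] + h2[:, 1]  # dosage at variant B
    # Haplotypes that carry both alternate alleles together.
    cis = h1[:, 0] * h1[:, 1] + h2[:, 0] * h2[:, 1]
    # Alternate A on one haplotype, alternate B on the other.
    trans = h1[:, 0] * h2[:, 1] + h2[:, 0] * h1[:, 1]
    return np.column_stack([np.ones(len(g_a)), g_a, g_b, cis, trans])


# Simulate phased haplotypes (assumed allele frequencies, no LD for simplicity).
rng = np.random.default_rng(0)
n = 20000
freqs = np.array([0.3, 0.4])
h1 = (rng.random((n, 2)) < freqs).astype(int)
h2 = (rng.random((n, 2)) < freqs).astype(int)

X = phase_design(h1, h2)

# Assumed trait: an effect only when the two alternate alleles are in cis.
y = 0.5 * X[:, 3] + rng.normal(size=n)

# Ordinary least squares fit; beta is ordered like the columns of X.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
for name, b in zip(["intercept", "gA", "gB", "cis", "trans"], beta):
    print(f"{name:>9}: {b: .3f}")
```

Under this encoding the cis and trans counts differ only in double heterozygotes, which is where phase information adds anything beyond unphased genotypes. A real analysis would also need covariates, standard errors, and handling of sex-chromosome loci such as SLC6A14, none of which are covered here.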