PETScan: Score-Based Genome-Wide Association Analysis of RNA-Seq and ATAC-Seq Data

Abstract

High-dimensional sequencing data, such as RNA-Seq for gene expression and ATAC-Seq for chromatin accessibility, are widely used in systems biology. Accessible chromatin allows transcription factors and regulatory elements to bind to DNA, thereby regulating transcription through the activation or repression of target genes. Association analysis of RNA-Seq and ATAC-Seq data therefore provides insights into gene regulatory mechanisms. Most existing analytic tools focus exclusively on cis-associations, even though regulatory elements can physically interact with distant target genes. Furthermore, conventional approaches often rely on Pearson or Spearman correlations, which ignore the count-based nature of RNA-Seq data.

To address these limitations, we introduce PETScan, a computationally efficient genome-wide PEak-Transcript Score-based association analysis that uses negative binomial models to better accommodate RNA-Seq data. We leverage score tests and matrix calculations for improved computational efficiency, and combine an empirical permutation method with genomic control to ensure valid p-value calculations in studies with limited sample sizes. In real-world datasets, PETScan ran three orders of magnitude faster than Wald tests while identifying similar significant gene-peak pairs. The PETScan R package is available on GitHub at https://github.com/yajing-hao/PETScan.
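To illustrate why a score test can be cheaper than a Wald test in this setting, the following is a minimal sketch in Python, not the PETScan implementation. It assumes a negative binomial model with log link, an intercept-only null model, and a dispersion `alpha` that is treated as known; the function name `nb_score_stats` and its arguments are hypothetical. Because the null model does not depend on the peak, it is fitted once per gene, and each peak then costs only a few sums, whereas a Wald test would refit the full model for every gene-peak pair.

```python
def nb_score_stats(y, peaks, alpha):
    """Score statistics for one gene against many peaks (illustrative sketch).

    y      : list of RNA-Seq counts for one gene, one value per sample
    peaks  : list of peaks, each a list of accessibility values per sample
    alpha  : negative binomial dispersion, assumed known here (Var = mu + alpha*mu^2)

    Null model: log(mu_i) = b0, so the null MLE of mu is the sample mean of y.
    Each returned statistic is approximately chi-square with 1 df under the null.
    """
    n = len(y)
    mu = sum(y) / n                      # null fit, done once per gene
    denom = 1.0 + alpha * mu
    w = mu / denom                       # Fisher weight, constant under this null
    resid = [(yi - mu) / denom for yi in y]

    stats = []
    for x in peaks:
        xbar = sum(x) / n
        xc = [xi - xbar for xi in x]     # centering adjusts for the intercept
        u = sum(a * b for a, b in zip(xc, resid))   # score for the peak effect
        v = w * sum(a * a for a in xc)               # its variance under the null
        stats.append(u * u / v if v > 0 else 0.0)    # constant peaks carry no signal
    return stats


# Example: one gene, two peaks, four samples (made-up numbers).
example = nb_score_stats([2, 4, 6, 8], [[0, 1, 2, 3], [1, 1, 1, 1]], alpha=0.2)
```

In practice the loop over peaks would be replaced by matrix products, covariates beyond the intercept would enter the null model, and the dispersion would be estimated from the data; all of these are left out of the sketch.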
