Scan Statistics for Nonhomogeneous Poisson Processes with Extreme-Value Calibration and Application to CNV Detection

Abstract

We develop a scan-statistic method for detecting local clusters in a two-sample nonhomogeneous Poisson process (NHPP) framework, motivated by copy number variation (CNV) analysis of next-generation sequencing data. The control sample is used to construct an empirical time transformation; under the null hypothesis, the transformed case sample is approximately uniform on [0,1]. The scan statistic is defined as the maximum number of transformed points within a moving window. We show that the scan statistic converges to a generalized extreme value (GEV) distribution with an extremal index that captures the dependence induced by overlapping windows. The GEV parameters are estimated by maximum likelihood and the extremal index by exceedance clustering methods, which together provide an asymptotic calibration of the test. A permutation procedure is also developed as a nonparametric alternative. Simulation studies demonstrate that the proposed methods maintain accurate type I error rates and perform well compared with the competing continuous testing method under heterogeneous baseline intensities. An application to sequencing data illustrates the effectiveness of the proposed approach for detecting CNV regions.
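The pipeline described in the abstract (time transformation, moving-window maximum, permutation calibration) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the use of the control sample's raw empirical CDF as the time transformation, the fixed window width `w`, and the label-shuffling permutation scheme are all assumptions made for illustration. The GEV and extremal-index calibration is not shown.

```python
import numpy as np


def empirical_transform(control, case):
    """Map case points to [0, 1] with the control sample's empirical CDF.

    Assumption: the raw step-function ECDF is used; the paper may smooth
    or interpolate the transformation differently.
    """
    control = np.sort(np.asarray(control, dtype=float))
    case = np.asarray(case, dtype=float)
    # Fraction of control points less than or equal to each case point.
    return np.searchsorted(control, case, side="right") / control.size


def scan_statistic(u, w):
    """Maximum number of points of u in any closed window of width w."""
    u = np.sort(np.asarray(u, dtype=float))
    if u.size == 0:
        return 0
    # The maximum is attained by a window whose left edge sits on a point,
    # so it suffices to count points in [u[i], u[i] + w] for each i.
    right = np.searchsorted(u, u + w, side="right")
    return int(np.max(right - np.arange(u.size)))


def permutation_pvalue(control, case, w, n_perm=200, seed=0):
    """Permutation p-value, assuming case/control labels are exchangeable
    under the null (a hypothetical scheme, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    control = np.asarray(control, dtype=float)
    case = np.asarray(case, dtype=float)
    observed = scan_statistic(empirical_transform(control, case), w)
    pooled = np.concatenate([control, case])
    n_case = case.size
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        # First n_case shuffled points play the role of the case sample.
        u = empirical_transform(perm[n_case:], perm[:n_case])
        exceed += int(scan_statistic(u, w) >= observed)
    # Add-one correction keeps the p-value strictly positive.
    return observed, (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    control = rng.uniform(0, 100, size=2000)
    # Background case points plus a synthetic local cluster in [40, 41].
    case = np.concatenate([rng.uniform(0, 100, 300), rng.uniform(40, 41, 60)])
    print(permutation_pvalue(control, case, w=0.02))
```

Because the transformation is built from the control sample, a baseline intensity that varies along the genome is flattened before scanning, so a single window width in transformed time corresponds to windows holding roughly equal numbers of control points.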
