Large-scale composite hypothesis testing for omics analyses

Abstract

Composite Hypothesis Testing (CHT) based on summary statistics has become a popular strategy to assess the effect of the same marker (or gene) jointly across multiple traits or at different omics levels. Although significant efforts have been made to develop efficient CHT procedures, most approaches face scalability constraints in the number of traits/omics and markers they can handle, or fail to account efficiently for potential correlations across traits. Methods relying on mixture models partially circumvent these limitations but do not provide proper p-values, which hampers the use of classical multiple testing procedures and graphical representations (e.g. Manhattan or QQ plots) and limits their comparison with alternative approaches.

We introduce the qch_copula approach, which combines the mixture model approach with a copula function to account for dependencies across traits/omics. The method comes with a p-value that is consistently defined for any composite hypothesis to be tested. By significantly reducing the memory burden of the EM algorithm during inference, the method scales to the analysis of several (up to 20) traits and 10^5 to 10^6 markers. We conducted a comprehensive benchmark study, comparing our approach with six recently developed state-of-the-art methods. The qch_copula procedure efficiently controls the Type I error rate and yields a substantial gain in detecting various patterns of joint associations. The value of the method is further illustrated by the joint analysis of 14 association studies to detect pleiotropic regions involved in psychiatric disorders.

The proposed method is implemented in the R package qch, available on CRAN.