Multiple-testing corrections in case-control studies using identity-by-descent segments
Abstract
Identity-by-descent (IBD) mapping provides complementary signals to genome-wide association studies (GWAS) when multiple causal haplotypes or variants are present but not directly tested. However, failing to correct for multiple testing in case-control studies using IBD segments can lead to false discoveries. We propose the difference between case-case and control-control IBD rates as an IBD mapping statistic. For our hypothesis test, we use a computationally efficient approach from the stochastic processes literature to derive genome-wide significance levels that control the family-wise error rate (FWER). Whole genome simulations indicate that our method conservatively controls the FWER. Because positive selection can lead to false discoveries, we pair our IBD mapping approach with a selection scan so that the two sets of results can be compared for evidence of confounding by recent sweeps or other mechanisms, such as population structure, that increase IBD sharing. We developed automated and reproducible workflows to phase haplotypes, call local ancestry probabilities, and perform the IBD mapping scan; the first two tasks are important preprocessing steps for haplotype analyses. We applied our methods to search for Alzheimer’s disease (AD) risk loci in the Alzheimer’s Disease Sequencing Project (ADSP) genome data. We identified six genome-wide significant signals of AD risk among samples genetically similar to African and European reference populations and among self-identified Amish samples. Variants in the six potential risk loci we detected have previously been associated with AD, dementia, and memory decline. Three genes at two potential risk loci have already been nominated as therapeutic targets. Overall, our scalable approach makes further use of large consortia resources, which are expensive to collect but provide insights into disease mechanisms.
Highlights
- We propose a computationally efficient method to address multiple testing when scanning along the genome for differences in identity-by-descent rates of case-case and control-control pairs.
- Whole genome simulations indicate that our method conservatively controls the desired family-wise error rate.
- We performed three case-control scans from ancestry cohorts in the Alzheimer’s Disease Sequencing Project, detecting six genome-wide significant signals around potential risk loci.
- We show that positive selection can confound IBD mapping tests in samples genetically similar to Europeans.
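
The mapping statistic named in the abstract, the difference between case-case and control-control IBD rates, can be illustrated at a single locus with a minimal Python sketch. The function name `ibd_rate_difference`, the `(i, j, start, end)` segment layout, and the choice to count each pair of samples at most once per locus are assumptions made for illustration, not the authors' implementation. The sketch also leaves out the genome-wide significance threshold entirely.

```python
from math import comb


def ibd_rate_difference(segments, is_case, position):
    """Case-case IBD rate minus control-control IBD rate at one position.

    segments: iterable of (i, j, start, end) tuples, where i and j are sample
              indices and [start, end] is the shared interval (hypothetical layout).
    is_case:  list of booleans, True for cases and False for controls.
    position: genomic coordinate in the same units as start and end.
    """
    n_case = sum(is_case)
    n_control = len(is_case) - n_case
    if n_case < 2 or n_control < 2:
        raise ValueError("need at least two cases and two controls")

    case_pairs = set()
    control_pairs = set()
    for i, j, start, end in segments:
        # Skip self-pairs and segments that do not cover the position.
        if i == j or not (start <= position <= end):
            continue
        pair = (min(i, j), max(i, j))  # unordered pair, counted once (assumption)
        if is_case[i] and is_case[j]:
            case_pairs.add(pair)
        elif not is_case[i] and not is_case[j]:
            control_pairs.add(pair)
        # Case-control pairs do not enter this statistic.

    case_rate = len(case_pairs) / comb(n_case, 2)
    control_rate = len(control_pairs) / comb(n_control, 2)
    return case_rate - control_rate


# Toy data: samples 0-2 are cases, samples 3-5 are controls.
example_is_case = [True, True, True, False, False, False]
example_segments = [
    (0, 1, 10, 50),   # case-case
    (1, 2, 30, 60),   # case-case
    (3, 4, 40, 80),   # control-control
    (0, 3, 0, 100),   # case-control, ignored
]
print(ibd_rate_difference(example_segments, example_is_case, 45))
```

In the toy data, two of the three case-case pairs and one of the three control-control pairs share a segment covering position 45, so the difference is positive there. A genome scan would evaluate such a difference at many positions, which is what creates the multiple-testing problem the abstract addresses.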