Integrating enriched case data from national laboratory testing with population-based case-control analyses: a novel statistical likelihood-ratio methodology for PS4 applied to 325,345 breast cancer cases and 671,006 controls
Abstract
Background
For many evidence criteria within v3.0 of the ACMG/AMP guidelines, methodologies have been developed to empower their use outside the stipulated evidence strengths. However, no such methodology has been established for case-control data (PS4). With the release of large-scale unselected case-control datasets and expansion of nationally-collected laboratory datasets enriched for pathogenic variant carriers, there is potential to combine datasets across ascertainment contexts in a more quantitative manner using novel likelihood ratio tools.
Methods
Using our published PS4-LR-Calculator, we calculated a combined log likelihood ratio (PS4-LLR) across five datasets (three unselected and two enriched), and estimated the enrichment of pathogenic variants in clinically-ascertained laboratory data using truncating variant prevalence.
Results
Data were combined for 10,817 missense variants from 325,345 female breast cancer patients and 671,006 controls of Western European ancestry for five breast cancer susceptibility genes (BRCA1, BRCA2, PALB2, ATM, CHEK2). A combined LLR was produced for 4,690 missense variants; 927 variants received evidence towards pathogenicity (LLR ≥ 1), and 3,242 received evidence towards benignity (LLR ≤ -1).
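As an illustration of how a combined LLR and the ±1 thresholds above might be applied per variant, the following is a minimal sketch, not the PS4-LR-Calculator itself. It assumes that per-dataset LLRs are independent and can be summed, and that datasets with no usable observation for a variant are skipped; neither assumption is stated in this abstract. The function names and the "uninformative" label for values strictly between -1 and 1 are hypothetical.

```python
from typing import Optional, Sequence


def combine_llrs(per_dataset_llrs: Sequence[Optional[float]]) -> Optional[float]:
    """Sum per-dataset log likelihood ratios for one variant.

    Assumption (not from the source): datasets are independent, so their
    LLRs add. A value of None marks a dataset with no usable observation.
    Returns None when no dataset contributes.
    """
    observed = [llr for llr in per_dataset_llrs if llr is not None]
    if not observed:
        return None
    return sum(observed)


def evidence_direction(combined_llr: Optional[float], threshold: float = 1.0) -> str:
    """Label a combined LLR using the thresholds quoted in the Results."""
    if combined_llr is None:
        return "no combined LLR"
    if combined_llr >= threshold:
        return "towards pathogenicity"
    if combined_llr <= -threshold:
        return "towards benignity"
    return "uninformative"  # hypothetical label for -1 < LLR < 1


# Hypothetical variant seen in three of five datasets (values are made up).
example = combine_llrs([0.5, None, 0.75, 0.25, None])
print(example, evidence_direction(example))
```

Under the counts reported above, 4,690 - 927 - 3,242 = 521 variants with a combined LLR would fall into the middle band of this sketch.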
Conclusion
This flexible, variant-level methodology combines nationally-collected ‘enriched’ datasets with unselected case-control cohorts, expanding the available information for case-control analysis, boosting power, enabling exploration of atypical penetrance and empowering variant classification.