To determine the analytical validity of SNP-chips for genotyping very rare genetic variants.
Retrospective study using data from two publicly available resources, the UK Biobank and the Personal Genome Project.
Research biobanks and direct-to-consumer genetic testing in the UK and USA.
49,908 individuals recruited to UK Biobank, and 21 individuals who purchased consumer genetic tests and shared their data online via the Personal Genome Project.
Main outcome measures
We assessed the analytical validity of genotypes from SNP-chips (index test) against sequencing data (reference standard). We evaluated the genotyping accuracy of the SNP-chips and split the results by variant frequency. We then selected rare pathogenic variants in the BRCA1 and BRCA2 genes as an exemplar for detailed analysis of clinically-actionable variants in UK Biobank, and assessed BRCA-related cancers (breast, ovarian, prostate and pancreatic) in participants using cancer registry data.
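As a rough illustration of how an index test can be scored against a reference standard, the sketch below tallies paired calls into a confusion matrix and derives sensitivity, specificity and precision. The function names and the boolean encoding (True means the variant allele was called) are assumptions made for this example; they are not taken from the study's own pipeline.

```python
from typing import Dict, Iterable, Tuple


def confusion_counts(chip_calls: Iterable[bool],
                     seq_calls: Iterable[bool]) -> Tuple[int, int, int, int]:
    """Count (TP, FP, FN, TN) for paired calls.

    chip_calls: index test (hypothetical encoding, True = variant called).
    seq_calls:  reference standard for the same genotypes, same order.
    """
    tp = fp = fn = tn = 0
    for chip, seq in zip(chip_calls, seq_calls):
        if chip and seq:
            tp += 1          # chip call confirmed by sequencing
        elif chip and not seq:
            fp += 1          # chip call not seen in sequencing
        elif not chip and seq:
            fn += 1          # variant missed by the chip
        else:
            tn += 1          # both agree the variant is absent
    return tp, fp, fn, tn


def metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity and precision; NaN when a denominator is zero."""
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "precision": tp / (tp + fp) if (tp + fp) else nan,
    }


if __name__ == "__main__":
    # Toy calls for illustration only, not study data.
    chip = [True, True, False, False, True]
    seq = [True, False, True, False, False]
    print(metrics(*confusion_counts(chip, seq)))
```

Splitting results by variant frequency, as described above, would amount to calling these functions separately on the genotypes in each frequency bin.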
SNP-chip genotype accuracy is high overall; sensitivity, specificity and precision are all >99% for 108,574 common variants directly genotyped by the UK Biobank SNP-chips. However, the likelihood of a true positive result falls dramatically with decreasing variant frequency; for variants with a frequency <0.001% in UK Biobank the precision is very low, and only 16% of 4,711 variants from the SNP-chips are confirmed by sequencing data. Results are similar for SNP-chip data from the Personal Genome Project, and 20/21 individuals have at least one rare pathogenic variant that has been incorrectly genotyped. For pathogenic variants in the BRCA1 and BRCA2 genes, the overall performance metrics of the SNP-chips in UK Biobank are sensitivity 34.6%, specificity 98.3% and precision 4.2%. Rates of BRCA-related cancers in individuals in UK Biobank with a positive SNP-chip result are similar to age-matched controls (OR 1.28, P=0.07, 95% CI: 0.98 to 1.67), while sequence-positive individuals have a significantly increased risk (OR 3.73, P=3.5×10⁻¹², 95% CI: 2.57 to 5.40).
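One way to see why precision can collapse for very rare variants even when sensitivity and specificity look high is the standard relationship between precision (positive predictive value) and the frequency of true carriers. The sketch below assumes, purely for illustration, that sensitivity and specificity stay fixed as carrier frequency changes; the function name and the example frequencies are hypothetical and are not estimates from the study.

```python
def ppv(sensitivity: float, specificity: float, carrier_freq: float) -> float:
    """Expected precision when a fraction `carrier_freq` of genotypes truly carry the variant.

    Simplifying assumption: error rates do not depend on variant frequency.
    """
    true_pos = sensitivity * carrier_freq
    false_pos = (1.0 - specificity) * (1.0 - carrier_freq)
    denom = true_pos + false_pos
    return true_pos / denom if denom else float("nan")


if __name__ == "__main__":
    # Illustrative carrier frequencies only (hypothetical values).
    for freq in (0.5, 0.01, 0.001, 0.00001):
        print(f"carrier frequency {freq:g}: precision {ppv(0.99, 0.99, freq):.4f}")
```

With the false positive rate held constant, false positives come to outnumber true positives as carriers become rarer, which matches the direction of the trend reported here; the real error behaviour of a given chip may of course differ from this simplified model.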
SNP-chips are extremely unreliable for genotyping very rare pathogenic variants and should not be used to guide health decisions without validation.
Section 1: What is already known on this topic
SNP-chips are an accurate and affordable method for genotyping common genetic variants across the genome. They are often used by direct-to-consumer (DTC) genetic testing companies and research studies, but there are several case reports suggesting they perform poorly for genotyping rare genetic variants when compared with sequencing.
Section 2: What this study adds
Our study confirms that SNP-chips are highly inaccurate for genotyping rare, clinically-actionable variants. Using large-scale SNP-chip and sequencing data from UK Biobank, we show that SNP-chips have a very low precision of <16% for detecting very rare variants (i.e. the majority of variants with population frequency of <0.001% are false positives). We observed a similar performance in a small sample of raw SNP-chip data from DTC genetic tests. Very rare variants assayed using SNP-chips should not be used to guide health decisions without validation.