Leveraging Large-Scale Biobanks for Therapeutic Target Discovery
Abstract
Large biobanks, including the Million Veteran Program (MVP), the UK Biobank, and FinnGen, provide genetic association results for more than 1,000,000 individuals across hundreds of phenotypes. To select targets for pharmaceutical development and to improve the understanding of existing targets, we harmonized these studies and performed two-sample Mendelian Randomization (MR) on 2,003 phenotypes, using genetic variants associated with gene expression (derived from GTEx and eQTLGen) and plasma protein levels (derived from ARIC, Fenland, and DeCODE) as proxies of target modulation. We found 69,669 gene-trait pairs with evidence (p ≤ 1.6 × 10⁻⁹) for causal effects. Across these pairs, 6,447 genes had strong causal evidence for at least one of the 2,003 investigated traits. As expected, being identified as a gene-trait pair in our approach was significantly associated with higher odds of corresponding to an approved drug target and its indication. We rediscovered 9% of approved drug targets in ChEMBL 34. Moreover, identified gene-trait pairs were significantly associated with higher odds of having been previously described as a gene-trait pair in OMIM, ClinVar, mouse knock-out data, and rare variant burden studies. To enhance the translational potential of the resource, we developed a predictive ranking model trained on approved drug targets described in ChEMBL 34 together with several biological annotations. This model accurately predicted the odds of a particular significant MR result being developed into an approved drug for its clinical indication (precision-recall AUC 0.79). We make our results publicly available in CIPHER.
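To make the two-sample MR step concrete, the following is a minimal sketch of a fixed-effect inverse-variance weighted (IVW) estimate, a commonly used two-sample MR estimator. The abstract does not state which estimator the study used, so the choice of IVW is an assumption, and the names `Instrument` and `ivw_estimate` and the input values are hypothetical illustrations rather than study data. Only the significance threshold (1.6 × 10⁻⁹) comes from the text above.

```python
import math
from dataclasses import dataclass

# Significance threshold quoted in the abstract.
SIGNIFICANCE_THRESHOLD = 1.6e-9


@dataclass
class Instrument:
    """One genetic variant used as an instrument (hypothetical container)."""
    beta_exposure: float  # variant effect on expression or protein level
    beta_outcome: float   # variant effect on the trait
    se_outcome: float     # standard error of beta_outcome


def ivw_estimate(instruments):
    """Fixed-effect IVW estimate of the exposure-to-trait effect.

    Assumes independent instruments and ignores uncertainty in
    beta_exposure (first-order weights). Returns (beta, se, p_value).
    """
    if not instruments:
        raise ValueError("at least one instrument is required")
    weight_sum = sum(
        i.beta_exposure ** 2 / i.se_outcome ** 2 for i in instruments
    )
    if weight_sum == 0:
        raise ValueError("all exposure effects are zero")
    numerator = sum(
        i.beta_exposure * i.beta_outcome / i.se_outcome ** 2
        for i in instruments
    )
    beta = numerator / weight_sum
    se = math.sqrt(1.0 / weight_sum)
    # Two-sided p-value from the normal approximation.
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p_value


# Made-up instruments for illustration only.
example = [
    Instrument(beta_exposure=0.2, beta_outcome=0.1, se_outcome=0.01),
    Instrument(beta_exposure=0.4, beta_outcome=0.2, se_outcome=0.02),
]
beta, se, p_value = ivw_estimate(example)
is_significant = p_value <= SIGNIFICANCE_THRESHOLD
```

With a single instrument this reduces to the Wald ratio, `beta_outcome / beta_exposure`. A full analysis would also need allele harmonization between the exposure and outcome studies and handling of correlated variants, neither of which is shown here.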