Leveraging and partitioning polygenic risk scores to identify cancer-related proteins

Abstract

Background

Large-scale genome-wide association studies (GWAS) have identified numerous common susceptibility variants associated with various cancers, but the underlying molecular mechanisms remain largely unknown.

Methods

Here we investigated the associations of susceptibility SNPs from 21 cancers with the levels of 4,955 plasma proteins measured in cancer-free participants (N=8,664) from the Atherosclerosis Risk in Communities (ARIC) study. We used two complementary approaches to detect potential mediating proteins and sub-networks: one based on associations of polygenic risk scores with the plasma proteome (pQTS), and the other based on a sparse canonical correlation analysis of the cancer-associated SNPs with the plasma proteome (ARCHIE).
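As a rough illustration of the score-based approach, the sketch below builds a polygenic risk score as a weighted sum of risk-allele dosages and correlates it with each protein in turn. It is a minimal sketch on simulated data, not the authors' pipeline: the function names `polygenic_score` and `pqts_scan`, the simulated sample sizes, and the use of a plain Pearson correlation without covariate adjustment are all assumptions made for illustration.

```python
import numpy as np

def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages: (n_samples, n_snps) -> (n_samples,)."""
    return dosages @ weights

def pqts_scan(prs, proteins):
    """Pearson correlation and t-statistic between the score and each protein column."""
    n = prs.shape[0]
    z_prs = (prs - prs.mean()) / prs.std()
    z_prot = (proteins - proteins.mean(axis=0)) / proteins.std(axis=0)
    r = z_prs @ z_prot / n                      # one correlation per protein
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))   # test statistic for r = 0
    return r, t

# Simulated data (hypothetical sizes, not the ARIC data).
rng = np.random.default_rng(0)
n_samples, n_snps, n_proteins = 2000, 30, 50
dosages = rng.binomial(2, 0.3, size=(n_samples, n_snps)).astype(float)
weights = rng.normal(0.0, 0.1, size=n_snps)     # stand-in for GWAS effect sizes
prs = polygenic_score(dosages, weights)

proteins = rng.normal(size=(n_samples, n_proteins))
# Plant one true association so the scan has something to find.
proteins[:, 0] += 0.5 * (prs - prs.mean()) / prs.std()

r, t = pqts_scan(prs, proteins)
top = int(np.argmax(np.abs(t)))
print("strongest protein index:", top)
```

In a real analysis the per-protein test would typically adjust for covariates such as age, sex, and ancestry components, and the resulting p-values would be corrected for the number of proteins tested.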

Results

Across all cancers, we identified 90 associated proteins using pQTS, of which 53 were distal (trans) associations between cancer-related SNPs and proteins. ARCHIE identified 19 significantly associated protein networks encompassing a broader set of 433 proteins, often including the proteins identified by pQTS. We found that the proteins identified by pQTS and/or ARCHIE were enriched for relevant biological processes and cell types as well as cancer drivers, and had somatic evidence of association with the respective cancers. For example, using SNPs associated with risk of basal cell carcinoma, we identified two protein sets with distinct functions: one primarily enriched in immune and inflammatory responses and the other enriched in pigmentation. Additionally, we identified proteins associated with multiple related cancers, indicating potential pleiotropic protein activity.

Conclusion

Our analysis leverages known GWAS associations for cancers to identify protein networks underlying cancer risk and, accordingly, to partition polygenic risk scores into mechanistic components. As detailed molecular data for relevant tissues, cell types, and developmental stages become increasingly available, similar approaches will prove important for identifying downstream molecular targets of GWAS variants and for improving the interpretation and research application of polygenic risk scores.
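The idea of partitioning a polygenic risk score can be sketched as follows: once each SNP has been assigned to a component, the score splits into sub-scores that add back up to the total. This is only an illustrative sketch under stated assumptions. The function `partition_prs`, the component labels, the SNP assignments, and the weights are hypothetical and are not taken from the study; in practice the assignments would come from an analysis such as the network detection described above.

```python
import numpy as np

def partition_prs(dosages, weights, snp_sets):
    """Split a polygenic score into per-component sub-scores.

    snp_sets maps a component name to the column indices of its SNPs;
    the sets are assumed to be disjoint and to cover every SNP.
    """
    total = dosages @ weights
    parts = {name: dosages[:, idx] @ weights[idx] for name, idx in snp_sets.items()}
    return total, parts

# Simulated genotypes and made-up effect sizes (hypothetical values).
rng = np.random.default_rng(1)
dosages = rng.binomial(2, 0.25, size=(100, 6)).astype(float)
weights = np.array([0.10, 0.05, 0.20, 0.08, 0.12, 0.03])

# Hypothetical assignment of SNPs to components.
snp_sets = {
    "immune": np.array([0, 1, 2]),
    "pigmentation": np.array([3, 4]),
    "unassigned": np.array([5]),
}

total, parts = partition_prs(dosages, weights, snp_sets)
recombined = sum(parts.values())   # sub-scores add back up to the full score
print(np.allclose(recombined, total))
```

Because the sets are disjoint and exhaustive, the decomposition is exact; if a SNP were assigned to more than one component, the sub-scores would no longer sum to the total.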
