Integration of GWAS-Derived Polygenic Risk Scores with Single-Cell RNA Sequencing to Identify Cell-Type–Specific Genetic Risk in Type 2 Diabetes.

Abstract

Type 2 Diabetes (T2D) is a complex disease influenced by both genetic factors and cellular dysfunction within pancreatic tissue. Genome-Wide Association Studies (GWAS) have identified many genetic variants associated with T2D risk, but it remains difficult to determine how these variants affect different cell types. This study introduces a framework that integrates GWAS-derived Polygenic Risk Scores (PRS), single-cell RNA sequencing (scRNA-seq) data, and graph-based machine learning to explore how genetic risk unfolds at the cell-type level in T2D. We started by filtering GWAS summary statistics, which cover about 29.7 million variants, to retain high-confidence variants on chromosome 22, which were subsequently mapped to specific genes [1]. With single-cell RNA sequencing, we identified nine pancreatic cell types: alpha, beta, delta, acinar, ductal, fibroblast, endothelial, macrophage, and perivascular cells [2]. Gene expression varied considerably across these cell types; alpha cells, macrophages, and fibroblasts had the most genes showing large expression differences [3]. By connecting GWAS genes with single-cell gene expression data, we identified 161 genes that overlapped between the two [4]. We then calculated a PRS for each gene and combined these scores with the gene expression results to identify which cell types carried greater genetic risk. Interestingly, immune and stromal cells, especially macrophages and fibroblasts, showed higher risk scores than beta cells, the traditional focus of T2D research. To dig deeper, we used a Graph Neural Network to examine protein interactions, which highlighted key genes in the network, including SOX10, SHANK3, NCF4, and OSM [5]. Functional enrichment analysis repeatedly pointed to immune responses and cytokine signalling pathways [6].
Altogether, merging genetic data, single-cell RNA profiles, and network-based machine learning gives a clearer picture of how T2D unfolds at the cellular level. This approach also suggests that immune and stromal cells might play a bigger part in the disease than previously thought.
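
The abstract does not specify how gene-level scores and expression were combined into a cell-type risk score. As a minimal sketch of one plausible approach, the example below takes an expression-weighted mean of gene-level scores over the genes shared between the two data sets. The function name `cell_type_risk`, the weighting scheme, and all gene names, cell-type labels, and numbers are assumptions made for illustration; none of them come from the study.

```python
def cell_type_risk(gene_scores, expression):
    """Expression-weighted mean of gene-level risk scores per cell type.

    gene_scores: {gene: score}, hypothetical gene-level risk weights.
    expression:  {cell_type: {gene: mean expression}}, hypothetical values.
    Only genes present in both inputs contribute (the "overlap" set).
    """
    risk = {}
    for cell_type, profile in expression.items():
        shared = [gene for gene in profile if gene in gene_scores]
        total_expr = sum(profile[gene] for gene in shared)
        if total_expr == 0:
            # No shared gene is expressed, so no risk can be attributed.
            risk[cell_type] = 0.0
            continue
        weighted = sum(gene_scores[gene] * profile[gene] for gene in shared)
        risk[cell_type] = weighted / total_expr
    return risk


# Toy inputs, invented purely for illustration (not data from the study).
gene_scores = {"GENE_A": 0.8, "GENE_B": 0.6, "GENE_C": 0.2}
expression = {
    "cell_type_1": {"GENE_A": 4.0, "GENE_B": 2.0, "GENE_C": 0.0},
    "cell_type_2": {"GENE_A": 0.0, "GENE_B": 0.0, "GENE_C": 2.0, "GENE_D": 9.0},
}

risk = cell_type_risk(gene_scores, expression)
# Cell types ordered from highest to lowest score.
ranked = sorted(risk, key=risk.get, reverse=True)
```

Normalising by total expression of the shared genes keeps the score comparable across cell types with different overall expression levels; an unnormalised sum would favour cell types that simply express more genes.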
