Integration of GWAS-Derived Polygenic Risk Scores with Single-Cell RNA Sequencing to Identify Cell-Type–Specific Genetic Risk in Type 2 Diabetes

Rupanjali Singh

Abstract

Type 2 diabetes (T2D) is a complex disease influenced by both genetic factors and cellular dysfunction within pancreatic tissue. Genome-wide association studies (GWAS) have identified many genetic variants associated with T2D risk, but it remains difficult to determine how these variants affect individual cell types. This study introduces a framework that integrates GWAS-derived polygenic risk scores (PRS), single-cell RNA sequencing (scRNA-seq) data, and graph-based machine learning to explore how genetic risk manifests at the cell-type level in T2D. We first filtered GWAS summary statistics, which cover about 29.7 million variants, to retain high-confidence variants on chromosome 22; these variants were subsequently mapped to specific genes [1]. Using scRNA-seq, we identified nine pancreatic cell types: alpha, beta, delta, acinar, ductal, fibroblast, endothelial, macrophage, and perivascular cells [2]. Gene expression varied substantially across these cell types; alpha cells, macrophages, and fibroblasts had the most genes showing large differences [3]. Connecting the GWAS genes with the single-cell expression data yielded 161 genes common to both datasets [4]. We then calculated a PRS for each gene and combined these scores with the gene expression results to identify which cell types carried greater genetic risk. Notably, immune and stromal cells, especially macrophages and fibroblasts, showed higher risk scores than beta cells, the traditional focus of T2D research. To investigate further, we used a Graph Neural Network to examine protein interactions, which highlighted key genes in the network, including SOX10, SHANK3, NCF4, and OSM [5]. When we examined which biological functions were most involved, immune responses and cytokine signalling pathways emerged repeatedly [6].
Altogether, by merging genetic data, single-cell RNA profiles, and network-based machine learning, this approach gives a clearer picture of how T2D unfolds at the cellular level. It also suggests that immune and stromal cells may play a larger role in the disease than previously thought.
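
The filtering, gene-overlap, and cell-type scoring steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the filtering threshold or the formula used to combine gene-level scores with expression, so everything specific here is an assumption: the p-value cutoff of 5e-8, the sum of absolute effect sizes as a gene-level score, the expression-weighted average per cell type, and all toy variants, gene names, and expression values. It is not the author's implementation.

```python
from collections import defaultdict

# Hypothetical toy GWAS rows: (variant_id, chromosome, p_value, effect_size, mapped_gene).
GWAS_ROWS = [
    ("rs_a", "22", 1e-9, 0.12, "GENE_A"),
    ("rs_b", "22", 3e-8, -0.08, "GENE_A"),
    ("rs_c", "22", 0.20, 0.30, "GENE_B"),   # fails the p-value filter
    ("rs_d", "1", 1e-10, 0.25, "GENE_C"),   # not on chromosome 22
    ("rs_e", "22", 2e-8, 0.05, "GENE_D"),   # gene absent from the toy expression data
    ("rs_f", "22", 4e-9, 0.10, "GENE_B"),
]

# Hypothetical mean expression per cell type and gene (values carry no biological meaning).
EXPRESSION = {
    "beta": {"GENE_A": 1.0, "GENE_B": 1.0},
    "macrophage": {"GENE_A": 3.0, "GENE_B": 1.0},
}


def filter_variants(rows, chromosome="22", p_threshold=5e-8):
    """Keep variants on one chromosome below an assumed significance cutoff."""
    return [r for r in rows if r[1] == chromosome and r[2] < p_threshold]


def gene_level_scores(rows):
    """Assumed gene-level score: sum of absolute effect sizes of mapped variants."""
    scores = defaultdict(float)
    for _variant, _chrom, _p, effect, gene in rows:
        scores[gene] += abs(effect)
    return dict(scores)


def cell_type_risk(gene_scores, expression):
    """Assumed cell-type score: expression-weighted average of gene-level scores,
    restricted to genes present in both the GWAS and expression data."""
    risk = {}
    for cell_type, expr in expression.items():
        shared = gene_scores.keys() & expr.keys()
        total_expr = sum(expr[g] for g in shared)
        weighted = sum(gene_scores[g] * expr[g] for g in shared)
        risk[cell_type] = weighted / total_expr if total_expr else 0.0
    return risk


if __name__ == "__main__":
    kept = filter_variants(GWAS_ROWS)
    scores = gene_level_scores(kept)
    for cell_type, value in sorted(cell_type_risk(scores, EXPRESSION).items()):
        print(f"{cell_type}: {value:.3f}")
```

Restricting the score to genes found in both datasets mirrors the overlap step in the abstract (161 shared genes in the study); a real analysis would replace the toy dictionaries with the filtered summary statistics and per-cell-type expression profiles.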

Version published to 10.21203/rs.3.rs-9396739/v1 on Research Square
Apr 14, 2026

Related articles

Connecting polygenic disease risk to cell states and regulatory programs through single-cell chromatin accessibility

This article has 5 authors:
1. Liyang Yu
2. Luke T. Deary
3. Qiaoxue Liu
4. Qirui Zhang
5. Siming Zhao
This article has no evaluations. Latest version: Apr 28, 2026

Gene-Environment Interaction through single-cell transcriptomics in Type 2 diabetes

This article has 1 author:
1. Rupanjali Singh
This article has no evaluations. Latest version: Apr 2, 2026

Integrating network annotation from multiple correlated traits to improve polygenic risk scores based on GWAS summary statistics

This article has 4 authors:
1. Qiuying Sha
2. Lirong Zhu
3. Xuewei Cao
4. Shuanglin Zhang
This article has no evaluations. Latest version: Apr 13, 2026
