Decoding the GPRC5A Paradox in Pancreatic Ductal Adenocarcinoma: A Subtype-Stratified, Treatment-Deconfounded, Multi-Omic Investigation
Abstract
Background: Pancreatic ductal adenocarcinoma (PDAC) carries a five-year survival rate below 10%, underscoring the urgent need for mechanistically grounded prognostic biomarkers. A prior batch-harmonized machine learning framework identified GPRC5A as prognostically relevant in PDAC but found reduced expression in deceased patients, the opposite of its established oncogenic role, generating an unexplained paradox.

Objectives: To resolve the GPRC5A paradox through five interrelated analyses: molecular subtype stratification, gemcitabine treatment deconfounding, RNA–protein concordance assessment, somatic mutation mapping, and machine learning role-state classification.

Methods: TCGA-PAAD (n=177) provided RNA-seq, clinical, and somatic mutation data; CPTAC-PAAD (n=140) provided matched proteomics. Molecular subtypes were assigned using the Moffitt 2015 single-sample classifier. Cox proportional hazards models and multivariable adjustment were used for survival analyses. GPRC5A RNA–protein concordance was quantified by Spearman correlation and benchmarked against 4,491 genome-wide gene pairs. Somatic mutations were mapped onto an AlphaFold2-predicted GPRC5A structure. A leakage-free Random Forest, XGBoost, and logistic regression pipeline was trained to predict GPRC5A functional role state (oncogenic vs. suppressive) from subtype and co-expression features.

Results: In the classical subtype (n=100), high GPRC5A expression associated with significantly worse survival (log-rank p=0.00024; HR=1.53, 95% CI 1.17–2.00). In the basal-like subtype (n=77), high expression paradoxically associated with modestly better survival (log-rank p=0.022; HR=1.26, 95% CI 1.06–1.50 by continuous Cox model; see Discussion for reconciliation of KM and Cox directionality). GPRC5A remained a significant independent predictor across all multivariable models (fully adjusted HR=1.44, 95% CI 1.23–1.68, p=3.89×10⁻⁶). RNA–protein correlation was moderate (Spearman r=0.571, 84.6th genome-wide percentile), arguing against post-transcriptional repression. No somatic mutations were detected in GPRC5A. The Random Forest role-state classifier achieved a held-out test AUC of 0.833 (LOOCV AUC=0.758), with classical co-expression features dominating over GPRC5A expression itself.

Conclusions: The GPRC5A paradox is primarily explained by molecular subtype mixing, with gemcitabine-induced transcriptional confounding as a secondary contributor. Post-transcriptional regulation and somatic mutation are not major drivers. GPRC5A should be evaluated within, not across, molecular subtypes, and its absence of somatic mutations directs mechanistic inquiry toward epigenomic regulation. A machine learning classifier assigns GPRC5A functional role state from transcriptomic context with reasonable accuracy, providing a proof-of-concept tool for subtype-aware prognostic stratification in PDAC.
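
The RNA–protein concordance benchmark mentioned in the Methods (a per-gene Spearman correlation placed on a genome-wide distribution of such correlations) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names `average_ranks`, `spearman`, and `percentile_of` are hypothetical, the tie-handling convention for the percentile is an assumption, and no study data are used here.

```python
from bisect import bisect_left, bisect_right


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def percentile_of(value, background):
    """Percent of background values below `value`, counting ties as half.

    The half-weight for ties is an assumed convention, not one stated in the text.
    """
    s = sorted(background)
    below = bisect_left(s, value)
    ties = bisect_right(s, value) - below
    return 100.0 * (below + 0.5 * ties) / len(s)
```

In use, `spearman` would be applied to one gene's matched RNA and protein abundance vectors across samples, and `percentile_of` would compare that coefficient with the coefficients computed the same way for every other gene in the background set. A high percentile would then indicate that the gene's protein level tracks its transcript level at least as well as most genes do.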