PCAGroupAdam: A PCA-Based Deep Learning Framework with Custom Optimization for Cancer Biomarker Discovery and Classification in High-Dimensional Gene Expression Data
Abstract
High-dimensional gene expression datasets present unique challenges in cancer biomarker discovery and classification. Here, we propose a novel deep learning framework that combines principal component analysis (PCA) for dimensionality reduction with a custom optimizer, PCAGroupAdam, for effective gradient scaling. The framework was tested on multiple gene expression datasets, achieving superior classification performance compared with traditional optimizers (Adam, RMSprop, SGD). Key findings include the identification of biologically relevant genes such as AGR2, TSPAN8, and GAPDH, which were linked to cancer progression using SHAP analysis and validated through functional annotation (GO/KEGG) and STRING protein-protein interaction analysis, as well as a lncRNA of unknown function that was found to be correlated with breast cancer. Our approach demonstrates strong performance (high accuracy and F1 scores, with significantly reduced loss values), interpretable results, and scalability to various high-dimensional omics datasets.
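The abstract names PCA as the dimensionality-reduction step but gives no implementation details. As a rough illustration of that step only, the following is a minimal pure-Python sketch that centers a samples-by-genes matrix, estimates the leading principal component by power iteration, and projects each sample onto it. The function names (`center`, `covariance`, `top_component`, `project`) and the toy matrix are hypothetical and are not taken from the paper. The sketch does not cover PCAGroupAdam, whose update rule is not described here.

```python
import math

def center(X):
    """Subtract the per-gene (column) mean from every sample (row)."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]

def covariance(Xc):
    """Sample covariance matrix (d x d) of already-centered data."""
    n, d = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(C, iters=200):
    """Leading eigenvector of a symmetric matrix via power iteration."""
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d  # uniform unit-length starting vector
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(Xc, v):
    """Score of each centered sample along the direction v."""
    return [sum(a * b for a, b in zip(row, v)) for row in Xc]

# Toy data (made up for illustration): 4 samples x 3 "genes".
# Genes 0 and 1 vary together across samples; gene 2 is nearly constant.
X = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.4],
    [3.0, 5.9, 0.6],
    [4.0, 8.0, 0.5],
]

Xc = center(X)
pc1 = top_component(covariance(Xc))
scores = project(Xc, pc1)  # one number per sample instead of three

print("PC1 loadings:", [round(x, 3) for x in pc1])
print("PC1 scores:  ", [round(s, 3) for s in scores])
```

In practice a library routine (for example an SVD-based PCA) would be used and several components retained. The loadings show which genes dominate the retained direction, and the scores are the reduced features a downstream classifier would see.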