Machine Learning-Driven Discovery of Biomarkers in Gastric Cancer: A Focus on DPT, FBP2, ADH7, INHBA, and GPR155
Abstract
Gastric cancer remains a severe threat to human health and often carries a poor prognosis, so effective biomarkers for early detection and targeted treatment are urgently needed. This study applied a comprehensive bioinformatics and machine learning approach to identify key protein biomarkers for gastric cancer and to elucidate their potential functions. Gastric cancer-related datasets were obtained from the NCBI Gene Expression Omnibus database. Differential expression analysis identified 171 genes that were differentially expressed between control and tumor samples. Using the LASSO, SVM-RFE, and RF algorithms, five genes (DPT, FBP2, ADH7, INHBA, and GPR155) were identified as potential biomarkers. Among ten machine learning models constructed from these five genes, a support vector machine (SVM) model performed best. Shapley additive explanations (SHAP) were used to illustrate the contribution of each pivotal gene to the SVM model. Gene set enrichment analysis and gene set variation analysis were then used to explore the functional roles of these genes in gastric cancer cells. Finally, we characterized the distinct effects of the signature genes on immune cell infiltration and patient prognosis. In conclusion, the identified proteins have the potential to serve as diagnostic biomarkers and to provide prognostic value for gastric cancer. This study offers a comprehensive, data-driven approach to uncovering critical molecular targets for improved detection and management of this deadly disease.
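The feature-selection step described in the abstract (keeping only genes chosen by LASSO, SVM-RFE, and RF alike, then fitting an SVM on them) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic data with hypothetical gene names in place of the GEO expression matrix, fits a plain Lasso regression on 0/1 labels as a simplification, and picks all thresholds (`alpha`, `k`) arbitrarily for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for an expression matrix: 120 samples x 30 genes.
# Gene names are hypothetical placeholders, not genes from the study.
n_samples, n_genes, n_signal = 120, 30, 5
genes = [f"gene_{i}" for i in range(n_genes)]
y = np.repeat([0, 1], n_samples // 2)      # 0 = control, 1 = tumor
X = rng.normal(size=(n_samples, n_genes))
X[y == 1, :n_signal] += 2.0                # first 5 genes differ between classes


def select_lasso(X, y, alpha=0.05):
    """Genes with a non-zero coefficient under an L1 penalty (assumed alpha)."""
    model = Lasso(alpha=alpha).fit(X, y)
    return {genes[i] for i in np.flatnonzero(model.coef_)}


def select_svm_rfe(X, y, k=8):
    """Genes kept by recursive feature elimination around a linear SVM."""
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=k).fit(X, y)
    return {genes[i] for i in np.flatnonzero(rfe.support_)}


def select_rf(X, y, k=8):
    """Top-k genes ranked by random forest feature importance."""
    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
    top = np.argsort(rf.feature_importances_)[::-1][:k]
    return {genes[i] for i in top}


# Consensus signature: genes selected by all three methods.
consensus = sorted(select_lasso(X, y) & select_svm_rfe(X, y) & select_rf(X, y))

# Fit an SVM on the consensus genes and estimate accuracy by 5-fold CV.
mean_acc = None
if consensus:
    cols = [genes.index(g) for g in consensus]
    mean_acc = cross_val_score(SVC(kernel="rbf"), X[:, cols], y, cv=5).mean()

print("consensus genes:", consensus)
print("mean CV accuracy:", mean_acc)
```

Because the genes here are selected on the full dataset before cross-validation, the reported accuracy is optimistic; in a real analysis the selection would normally be repeated inside each training fold or validated on an independent cohort.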