Exploration of biomarkers for predicting the prognosis of patients with diffuse large B-cell lymphoma by machine-learning analysis

Abstract

Background: Diffuse large B-cell lymphoma (DLBCL), a distinct hematological malignancy, is a major public health problem. However, its molecular mechanisms have not been clearly elucidated. This study aimed to identify disease-specific diagnostic biomarkers and mechanisms.
Methods: Three microarray datasets (GSE25638, GSE12195, GSE12453) were downloaded from the Gene Expression Omnibus (GEO) database. Key genes in DLBCL patients were screened by differentially expressed gene (DEG) analysis and weighted gene co-expression network analysis (WGCNA). Functional enrichment analysis and protein-protein interaction (PPI) network construction were used to reveal DLBCL-related pathogenic molecules and underlying mechanisms. Random forest analysis was used to screen candidate biomarkers, and Kaplan-Meier survival analysis was performed to predict patient risk. Single-sample gene set enrichment analysis was used to explore immune cell infiltration in lymphoma. Hub gene expression was validated by RT-PCR and immunohistochemistry (IHC).
Results: DEG analysis and WGCNA of the three datasets yielded 95 key genes. Four hub genes (CXCL9, CCL18, C1QA, CTSC) were selected by random forest analysis and a machine-learning algorithm. Receiver operating characteristic (ROC) analysis gave an area under the curve (AUC) of 1.00 in the training set; in the external validation set, bootstrap validation with 1000 resamples gave an AUC of 0.839. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses identified several enriched pathways. The four hub genes showed excellent potential for predicting the survival of DLBCL patients. Dysregulated immune cell infiltration was observed in DLBCL and correlated positively with each of the four hub genes. High expression of the hub genes was also confirmed in DLBCL patients.
Conclusion: This study identified four candidate hub genes (CXCL9, CCL18, C1QA, CTSC) that could predict the risk of DLBCL. CXCL9 may be essential in the development of the disease. These findings provide a new perspective on the molecular mechanisms of DLBCL and on potential therapeutic targets.
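
The abstract reports an AUC from 1000 bootstrap resamples of the external validation set but does not describe the procedure. The following is a minimal sketch of one common approach, a percentile bootstrap of the AUC, and is not necessarily what the authors did. The function names `auc` and `bootstrap_auc`, the seed, and the synthetic scores are all assumptions made for illustration; no data from the study are used.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_boot=1000, seed=0):
    """Resample (label, score) pairs with replacement n_boot times and
    return the mean AUC with an approximate 95% percentile interval."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with a single class has no defined AUC
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    mean = sum(stats) / n_boot
    lower = stats[int(0.025 * n_boot)]
    upper = stats[int(0.975 * n_boot) - 1]
    return mean, lower, upper


if __name__ == "__main__":
    # Synthetic classifier scores (hypothetical, not study data):
    # 40 cases drawn around 1.0 and 40 controls drawn around 0.0.
    rng = random.Random(42)
    labels = [1] * 40 + [0] * 40
    scores = [rng.gauss(1.0, 1.0) for _ in range(40)] + \
             [rng.gauss(0.0, 1.0) for _ in range(40)]
    mean, lower, upper = bootstrap_auc(labels, scores)
    print(f"bootstrap AUC: {mean:.3f} (95% interval {lower:.3f} to {upper:.3f})")
```

Reporting the interval alongside the point estimate shows how stable the validation AUC is under resampling, which matters when a training-set AUC of 1.00 raises the possibility of overfitting.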
