Machine-learning-based analysis of transcriptomics data for the identification of molecular signatures in cancer
Abstract
Early detection and treatment of head and neck squamous cell carcinoma (HNSC) and oral squamous cell carcinoma (OSCC) could reduce the high mortality associated with these cancers. Identifying biomarkers is therefore crucial for detecting HNSC and OSCC in their initial stages and enabling prompt treatment. The present study establishes a machine-learning-based framework that employs a two-step feature selection strategy, analysis of variance (ANOVA) coupled with support vector machine-recursive feature elimination (SVM-RFE), to delineate biomolecular signatures in HNSC and OSCC. Classification models built on these signatures achieved high classification and prediction accuracies, highlighting the significance of the 22 HNSC and 23 OSCC candidate genes selected. K-means clustering supported this finding by showing a clear demarcation between the normal and tumor classes, and previous literature confirmed the importance of the biomolecular signatures in several cancers. Specifically, FAM107A, FAM3D and CXCR2 have been reported as probable diagnostic or prognostic candidates, while RRAGD, PPL, AQP7, SORT1, MAB21L4 and UBL3 were earlier suggested to have therapeutic roles in cancer. Three genes, namely ENDOU, RRAGD and SMIM5, are shared between the HNSC and OSCC signatures and could be of particular importance for that reason. Likewise, seven plasma-membrane-related features, i.e., CEACAM1, COBL, GPX3, HCG22, MUC21, PAX9 and SMIM5, were identified that could be targeted as non-invasive diagnostic, prognostic or therapeutic candidates. The HNSC and OSCC biomolecular signatures obtained with the present computational approach therefore show promise as diagnostic, prognostic or therapeutic biomarkers.
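The two-step feature selection and the clustering check described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn, a linear-kernel SVM, and a synthetic expression matrix, and the values `k_anova`, `step`, and the number of cross-validation folds are hypothetical choices made only for illustration. The signature size of 22 mirrors the HNSC gene count reported above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an expression matrix (samples x genes); NOT the study's data.
rng = np.random.default_rng(0)
n_samples, n_genes, n_informative = 120, 500, 20
y = np.repeat([0, 1], n_samples // 2)       # 0 = normal, 1 = tumor
X = rng.normal(size=(n_samples, n_genes))
X[y == 1, :n_informative] += 2.0            # shift the first 20 "genes" in tumor samples

# Step 1: the ANOVA F-test keeps the k_anova top-ranked genes (k_anova is an assumption).
# Step 2: SVM-RFE prunes them to n_signature genes using linear-SVM weights.
k_anova, n_signature = 100, 22
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("anova", SelectKBest(f_classif, k=k_anova)),
    ("svm_rfe", RFE(SVC(kernel="linear"), n_features_to_select=n_signature, step=5)),
    ("clf", SVC(kernel="linear")),
])

# Feature selection sits inside the pipeline, so each CV fold selects genes
# from its own training split only (no leakage into the held-out fold).
cv_accuracy = cross_val_score(pipeline, X, y, cv=5).mean()

# Refit on all samples and map the RFE mask back to original gene indices.
pipeline.fit(X, y)
anova_idx = pipeline.named_steps["anova"].get_support(indices=True)
signature_idx = anova_idx[pipeline.named_steps["svm_rfe"].get_support()]

# Unsupervised check: cluster samples on the signature genes alone and
# compare the two clusters with the known normal/tumor labels.
X_sig = StandardScaler().fit_transform(X[:, signature_idx])
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_sig)
ari = adjusted_rand_score(y, clusters)

print("signature gene indices:", signature_idx)
print("mean CV accuracy:", cv_accuracy)
print("adjusted Rand index (K-means vs. labels):", ari)
```

With real data, `X` would be a normalized expression matrix and `y` the normal/tumor labels. An adjusted Rand index near 1 would indicate that the clusters found on the selected genes largely coincide with the two classes.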