Research on Constructing a Prognosis Prediction Model of Breast Cancer LncRNA Based on the TCGA Database
Abstract
Breast cancer remains a leading cause of cancer-related mortality in women, and both prognosis prediction and therapeutic resistance remain challenging. This study aimed to identify long non-coding RNAs (lncRNAs) associated with breast cancer prognosis and to construct a predictive risk model. RNA-seq data from 963 breast cancer and 110 normal tissue samples were obtained from The Cancer Genome Atlas (TCGA). Differential expression analysis identified 1,197 differentially expressed lncRNAs and 1,809 differentially expressed mRNAs (|log2FC| > 2, p < 0.01). Univariate Cox and LASSO regression analyses identified 18 prognosis-related lncRNAs, which multivariate Cox regression further refined to 14 key lncRNAs. A risk scoring model was established from the expression levels and regression coefficients of these lncRNAs (e.g., AC093515.1, WT1-AS, LINC01224). Patients were stratified into high-risk (n = 481) and low-risk (n = 482) groups, which showed significantly different survival outcomes (p < 0.01); mortality was higher in the high-risk group. In ROC analysis, the model achieved an AUC of 0.731. Notably, LINC01224, AL133467.1, and MAFA-AS1 were identified as protective factors (HR < 1), while AC093515.1, WT1-AS, and LINC00668 acted as risk factors (HR > 1). These findings provide a novel lncRNA-based prognostic tool and potential therapeutic targets for breast cancer management.
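To make the risk scoring step concrete, the following is a minimal sketch of how a score of this kind is commonly computed: a weighted sum of lncRNA expression values, with Cox regression coefficients as weights, followed by a split into risk groups. Several things here are assumptions rather than details from the abstract: the coefficient values are hypothetical placeholders (only their signs follow the reported HR < 1 and HR > 1 directions), only six of the 14 lncRNAs are named in the abstract and shown here, and the median cutoff is assumed, although it would be consistent with the reported 481/482 split. The function names `hazard_ratio`, `risk_score`, and `stratify` are illustrative.

```python
import math
from statistics import median

# Hypothetical coefficients for illustration only; the abstract does not
# report the fitted values. Signs follow the abstract: negative for
# protective lncRNAs (HR < 1), positive for risk lncRNAs (HR > 1).
COEFS = {
    "LINC01224": -0.30,
    "AL133467.1": -0.20,
    "MAFA-AS1": -0.10,
    "AC093515.1": 0.25,
    "WT1-AS": 0.15,
    "LINC00668": 0.10,
}


def hazard_ratio(beta):
    """Convert a Cox regression coefficient to a hazard ratio, HR = exp(beta)."""
    return math.exp(beta)


def risk_score(expression, coefs=COEFS):
    """Weighted sum of expression values, one term per lncRNA in the model."""
    return sum(beta * expression[gene] for gene, beta in coefs.items())


def stratify(scores):
    """Label each patient high or low risk relative to the median score (assumed cutoff)."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


if __name__ == "__main__":
    # Toy expression profiles, not TCGA data.
    baseline = {gene: 0.0 for gene in COEFS}
    patients = {
        "P1": {**baseline, "AC093515.1": 2.0},  # high expression of a risk lncRNA
        "P2": {**baseline, "LINC01224": 2.0},   # high expression of a protective lncRNA
        "P3": dict(baseline),
    }
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    print(scores)
    print(stratify(scores))
```

With an odd number of patients and distinct scores, a strict "greater than the median" rule places the median patient in the low-risk group, so 963 patients would split into 481 high-risk and 482 low-risk, as in the reported group sizes.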