Development and validation of a nomogram model of lung metastasis in breast cancer based on machine learning algorithm and cytokines

Abstract

Background: The relationship between cytokines and lung metastasis (LM) in breast cancer (BC) remains unclear, and current clinical methods for identifying breast cancer lung metastasis (BCLM) lack precision, underscoring the need for an accurate risk prediction model. This study aimed to apply machine learning algorithms to identify the key risk factors for BCLM and then to develop a reliable prediction model centered on cytokines.

Methods: This population-based retrospective study included 326 BC patients admitted to the Second Affiliated Hospital of Xuzhou Medical University between September 2018 and September 2023. Patients were randomly assigned to a training cohort (70%; n = 228) or a validation cohort (30%; n = 98), and the risk factors for BCLM were identified using Least Absolute Shrinkage and Selection Operator (LASSO), Extreme Gradient Boosting (XGBoost) and Random Forest (RF) models. Significant risk factors were visualized with a Venn diagram and incorporated into a nomogram model. The model's performance was then evaluated on three criteria: discrimination, using receiver operating characteristic (ROC) curves; calibration, using calibration plots; and clinical utility, using decision curve analysis (DCA).

Results: Of the 326 patients, 70 developed LM. A nomogram was developed to predict the 5-year and 10-year BCLM risk by incorporating five key variables, namely endocrine therapy, hsCRP, IL6, IFN-α and TNF-α. For the 5-year prediction model, the training and validation cohorts had AUC values of 0.786 (95% CI: 0.691–0.881) and 0.627 (95% CI: 0.441–0.813), respectively, while for the 10-year prediction model, the corresponding AUC values were 0.687 (95% CI: 0.528–0.847) and 0.797 (95% CI: 0.605–0.988), respectively. ROC analysis further confirmed the model's strong discriminative ability, while calibration plots indicated that the predicted and observed outcomes were in good agreement in both cohorts. Finally, DCA demonstrated the model's effectiveness in clinical practice.

Conclusion: Using machine learning algorithms, this study developed a nomogram that could effectively identify BC patients who were at a higher risk of developing LM, thus providing a valuable tool for decision-making in clinical settings.
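
The workflow described in the abstract (a random 70/30 split, a Venn-style intersection of the variables chosen by three selectors, and AUC as the discrimination measure) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code. The function names, the random seed, and the extra variables in each selector's set (`age`, `tumor_size`, `IL10`) are hypothetical and chosen only so that the intersection reproduces the five variables named in the abstract. The AUC here is computed as a plain rank statistic on binary labels, which is a simplification of the time-dependent 5-year and 10-year analysis.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly assign patients to a training and a validation cohort."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seed is an arbitrary assumption
    n_train = int(len(ids) * train_fraction)  # 326 patients -> 228 for training
    return ids[:n_train], ids[n_train:]


def consensus_features(*selections):
    """Keep only variables chosen by every selector (centre of the Venn diagram)."""
    return set.intersection(*(set(s) for s in selections))


def auc(scores, labels):
    """AUC as the fraction of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


train_ids, valid_ids = split_cohort(range(326))

# Hypothetical selector outputs: only the five shared variables come from the abstract.
shared = {"endocrine_therapy", "hsCRP", "IL6", "IFN_alpha", "TNF_alpha"}
lasso_selected = shared | {"age"}
xgboost_selected = shared | {"tumor_size"}
rf_selected = shared | {"age", "IL10"}

key_variables = consensus_features(lasso_selected, xgboost_selected, rf_selected)
print(len(train_ids), len(valid_ids), sorted(key_variables))
```

Taking the intersection rather than the union favours variables that all three selectors agree on, which keeps the resulting nomogram small at the cost of possibly discarding a variable that only one method rates highly.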
