Development and Interpretability Analysis of a Machine Learning-Based Model for Predicting Early Liver Metastasis Risk After Pancreatic Cancer Surgery

Abstract

Background Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal malignancy. Even after radical surgical resection, patients remain at high risk of recurrence and distant metastasis. The liver is the most common site of hematogenous metastasis, and liver metastasis markedly shortens survival, making it a key determinant of prognosis.

Objective To develop an interpretable machine learning model, based on postoperative clinical variables, that predicts the risk of liver metastasis within one year after surgery in patients with pancreatic cancer.

Methods Data were collected from 418 patients who underwent radical pancreatic cancer surgery at the Department of Gastrointestinal Surgery, First Affiliated Hospital of Xinjiang Medical University, between January 2015 and August 2024. The data were randomly split into a training set and a test set in a 7:3 ratio. Seven machine learning models were evaluated using metrics including the area under the receiver operating characteristic curve (AUC-ROC) and the area under the precision-recall curve, reported as average precision (AP-PR). SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) were used to determine feature importance and explain the best-performing model.

Results After the inclusion and exclusion criteria were applied, 363 patients were included in the analysis, of whom 118 (32.5%) developed liver metastasis within one year of surgery. The final model incorporated 10 variables: chemotherapy status, tumor differentiation, vascular invasion (arterial/venous), hepatitis B infection, CA19-9 level, T stage, lymphocyte count, albumin level, alkaline phosphatase, and tumor size. Of the seven machine learning models, the Extra Trees (ET) model performed best, achieving an AUC-ROC of 0.82 (95% CI: 0.73–0.90) and an AP-PR of 0.77 in the test set. SHAP analysis identified postoperative chemotherapy, tumor differentiation, hepatic artery/portal vein invasion, and hepatitis B virus infection as the most influential predictors of liver metastasis.

Conclusion An interpretable machine learning model built on postoperative clinical data predicted the risk of liver metastasis within one year after pancreatic cancer surgery with good discrimination in the test set (AUC-ROC 0.82). It holds promise as an auxiliary tool for postoperative follow-up and personalized intervention, with feature contribution analysis giving clinicians more transparent decision-making support.
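
The evaluation workflow described in the Methods (a 7:3 split, an Extra Trees classifier, AUC-ROC and average precision on the test set) can be sketched roughly as follows. This is an illustration only, not the authors' code: the data are synthetic and generated to mimic the reported cohort size (363 patients, 10 features, about 32.5% positives), the hyperparameters are arbitrary, and the percentile bootstrap used for the 95% confidence interval is an assumption, since the abstract does not state how the interval was computed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cohort (assumption): 363 samples, 10 features,
# roughly 32.5% positive class. No real patient data is used here.
X, y = make_classification(
    n_samples=363, n_features=10, n_informative=5,
    weights=[0.675, 0.325], random_state=0,
)

# 7:3 random split; stratification is an assumption, not stated in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0,
)

# Extra Trees classifier with illustrative (not the authors') hyperparameters.
model = ExtraTreesClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]

auc = roc_auc_score(y_test, proba)
ap = average_precision_score(y_test, proba)


def bootstrap_auc_ci(y_true, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC-ROC (assumed method, for illustration)."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        # AUC is undefined when a resample contains only one class; skip it.
        if np.unique(y_true[idx]).size < 2:
            continue
        stats.append(roc_auc_score(y_true[idx], scores[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)


ci_low, ci_high = bootstrap_auc_ci(y_test, proba)
print(f"AUC-ROC = {auc:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f}), AP = {ap:.2f}")
```

The printed numbers describe the synthetic data only and are not comparable to the results reported above. With a test set of about 109 patients, a bootstrap interval of this kind will be fairly wide, which is in line with the width of the interval reported in the Results.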
