An explainable predictive machine learning model of oxaliplatin induced peripheral neuropathy based on clinical data: a retrospective single center study
Abstract
Background: Oxaliplatin, a third-generation platinum-based antineoplastic agent, is widely used to treat gastrointestinal malignancies such as colorectal cancer. However, oxaliplatin-induced peripheral neuropathy (OIPN) is a common and distinctive adverse effect with a high incidence. Characterized by numbness and paresthesia in the extremities, OIPN is dose-limiting, often irreversible, and significantly impairs patients' quality of life. Current assessment relies primarily on subjective symptoms, and effective predictive models are lacking.

Methods: This single-center retrospective cohort study included 829 colorectal cancer patients receiving oxaliplatin chemotherapy. Fourteen core features were selected from 104 candidate variables using Lasso regression, the Boruta algorithm, recursive feature elimination with cross-validation (RFECV), and gradient boosting decision trees (GBDT). Five machine learning models (XGBoost, Random Forest, AdaBoost, GBDT, and Gaussian naive Bayes [GNB]) were developed and evaluated. Models were optimized using 5-fold cross-validation, and performance was assessed with the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis (DCA). SHAP analysis was used to interpret the model, and an online risk calculator was developed.

Results: The GBDT model performed best, with an AUC of 0.997 (95% CI: 0.994–1.000) in the training set, 0.908 (95% CI: 0.854–0.962) in the validation set, and 0.892 (95% CI: 0.837–0.947) in the external test set. The model showed high calibration accuracy, and DCA showed significant net benefit. SHAP analysis identified Total-OXA, BMI, CEA, APOA-1, Sex, and ETCO as the top six predictors of OIPN, with Total-OXA having the largest, dose-dependent impact.

Conclusion: The GBDT machine learning model effectively predicts OIPN risk in colorectal cancer patients, and SHAP analysis enhances the model's interpretability. The online calculator provides a reliable tool for early clinical identification of high-risk patients and for personalized intervention strategies.
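For readers less familiar with two of the evaluation metrics named in the Methods, the following is a minimal sketch in plain Python of how AUC and the net benefit used in decision curve analysis are commonly computed. It is an illustration only, not the authors' implementation: the function names are hypothetical, and the labels and predicted risks are invented toy values, not study data.

```python
from typing import Sequence


def auc_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher predicted risk; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit of intervening on patients whose predicted risk is at
    or above `threshold`: TP/N - FP/N * threshold / (1 - threshold)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy in a decision curve: intervene on every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy data (assumption): 1 = developed OIPN, 0 = did not.
    labels = [1, 1, 1, 0, 1, 0, 0, 0]
    risks = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    print("AUC:", auc_score(labels, risks))
    print("Net benefit (model, threshold 0.5):", net_benefit(labels, risks, 0.5))
    print("Net benefit (treat all, threshold 0.5):", net_benefit_treat_all(labels, 0.5))
```

A decision curve plots the model's net benefit against the "treat all" and "treat none" (net benefit of zero) strategies across a range of thresholds; the model is useful at thresholds where its curve lies above both.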