Explainable Machine Learning Model for Predicting Early Neurological Deterioration in Patients with Acute Ischemic Stroke
Abstract
Objective: This study aimed to develop and validate predictive models for early neurological deterioration in patients with acute ischemic stroke using multiple machine learning methods.
Methods: A total of 1,285 patients with ischemic stroke admitted to Yijishan Hospital of Anhui Province from November 2020 to November 2024 were enrolled. The patients were randomly divided into a training set (70%) and a validation set (30%). Potential predictors were selected using a combination of Lasso regression and the Boruta algorithm. Seven machine learning algorithms (logistic regression, decision tree, random forest, XGBoost, k-nearest neighbor, light gradient boosting machine, and naïve Bayes) were used to build predictive models. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and other indicators.
Results: Lasso regression and the Boruta algorithm jointly identified nine potential predictors: history of hypertension, TACI (total anterior circulation infarct), LACI (lacunar anterior circulation infarct), SII (Systemic Inflammatory Response Index), ARC (acute-to-chronic glycemic ratio), HDL-C, LDL-C, ALB, and NIHSS score. All seven machine learning models performed well in both the training and validation sets. Among them, the XGBoost model performed best in the validation set, with an AUC of 0.881 (95% CI: 0.834–0.928), a sensitivity of 0.746, and a specificity of 0.874, giving it the best overall predictive ability of the seven models. Decision curve analysis (DCA) and calibration plots indicated excellent clinical benefit and discrimination ability. Finally, a SHAP summary plot was used to visualize and interpret the XGBoost model.
Conclusion: This study successfully developed the first machine learning-based predictive model for progressive ischemic stroke. In the model comparison and explainability analysis, the XGBoost model demonstrated superior predictive accuracy and clinical applicability, providing a reliable tool for early intervention.
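
As a rough illustration of the evaluation steps named in the Methods, the sketch below shows a random 70%/30% split and how AUC, sensitivity, and specificity can be computed from predicted probabilities. It is not the authors' code. The function names, the seed, the 0.5 classification threshold, and the toy labels and scores are all assumptions made for illustration; the abstract does not state the threshold or the exact split sizes used in the study.

```python
import random


def split_train_validation(items, train_percent=70, seed=2024):
    """Shuffle a copy of `items` and split it into training and validation lists.

    The cut point uses integer arithmetic, so the split sizes are exact.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher than
    a random negative case (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical patient indices, used only to show a 70/30 split of 1,285 records.
train_ids, validation_ids = split_train_validation(range(1285))

# Toy labels (1 = deterioration) and predicted probabilities, invented for illustration.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

toy_auc = auc(toy_labels, toy_scores)
toy_sensitivity, toy_specificity = sensitivity_specificity(toy_labels, toy_scores)
print(len(train_ids), len(validation_ids))
print(toy_auc, toy_sensitivity, toy_specificity)
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for a cohort of this size. In practice a library routine would normally be used, and sensitivity and specificity depend on the chosen probability threshold.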