Interpretable machine learning models for predicting tocilizumab response in rheumatoid arthritis using clinical data
Abstract
Introduction: Approximately 30% of patients with rheumatoid arthritis (RA) do not respond to tocilizumab (TCZ). This study aimed to predict TCZ response using machine learning models trained on clinical data.

Methods: Baseline and follow-up data from patients with RA treated with TCZ were collected. Seven machine learning models (logistic regression, K-nearest neighbors, random forest, support vector machine, decision tree, gradient boosting, and a boosting algorithm) were trained to predict the treatment response, after three to six months of therapy, of patients with RA at different imaging stages. The area under the receiver operating characteristic curve (AUC) was the main performance metric used to select the best model, and the relative importance of each variable in the model was ranked using SHapley Additive exPlanations (SHAP) values.

Results: A total of 245 patients with RA treated with TCZ were included. The logistic regression model showed the best predictive performance across imaging stages, with AUCs of 0.78 for patients without imaging changes, 0.73 for stage I, and 0.82 for stages II, III, and IV. The key features influencing predictions varied by imaging stage. The most important features were D-dimer concentration for patients without imaging changes, DAS28 grade for patients in stage I, and the physician's overall disease score (EGA) for patients in stages II, III, and IV, with SHAP values of 0.5180, 0.6661, and 1.3978, respectively.

Discussion: This study demonstrates the potential of machine learning to predict TCZ treatment outcomes and to identify stage-specific predictive features in patients with RA.
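The evaluation approach described in the Methods (score a model by AUC, then rank variables by SHAP value) can be illustrated with a minimal sketch. This is not the authors' code. It uses synthetic data, hypothetical feature names (`feature_a`, `feature_b`, `feature_c`), and a small pure-Python logistic regression instead of a library implementation. For a linear model, and assuming independent features, the SHAP value of feature j for one patient reduces in log-odds space to the coefficient times the feature's deviation from its mean. The study's reported SHAP values may have been computed differently.

```python
import math
import random

def auc(y_true, scores):
    """AUC as the chance that a random responder outranks a random non-responder (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def linear_shap(w, b, X):
    """Log-odds SHAP values for a linear model, assuming independent features.

    phi[i][j] = w[j] * (X[i][j] - mean_j); base is the log-odds at the feature means.
    """
    d = len(w)
    means = [sum(row[j] for row in X) / len(X) for j in range(d)]
    phi = [[w[j] * (row[j] - means[j]) for j in range(d)] for row in X]
    base = b + sum(w[j] * means[j] for j in range(d))
    return phi, base

# Synthetic cohort (assumption): response depends strongly on feature_a,
# weakly on feature_b, and not at all on feature_c.
random.seed(0)
names = ["feature_a", "feature_b", "feature_c"]
X = [[random.gauss(0, 1) for _ in names] for _ in range(300)]
y = [1 if random.random() < 1 / (1 + math.exp(-(2.0 * r[0] + 0.5 * r[1]))) else 0
     for r in X]
train_X, test_X, train_y, test_y = X[:200], X[200:], y[:200], y[200:]

w, b = fit_logistic(train_X, train_y)
test_scores = [b + sum(wj * xj for wj, xj in zip(w, row)) for row in test_X]
test_auc = auc(test_y, test_scores)

# Global importance: mean absolute SHAP value per feature, largest first.
phi, base = linear_shap(w, b, test_X)
importance = [sum(abs(row[j]) for row in phi) / len(phi) for j in range(len(names))]
ranking = sorted(zip(names, importance), key=lambda t: -t[1])

print(f"held-out AUC: {test_auc:.2f}")
for name, value in ranking:
    print(f"{name}: mean |SHAP| = {value:.4f}")
```

In this sketch the ranking is driven by mean absolute SHAP value on held-out data. With real clinical data, features on different scales would typically be standardized before fitting so that gradient descent converges well.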