Machine Learning-based Chemotoxicity Predictions in Patients with Colorectal Cancer: Integrating Race, Geospatial Social Determinants of Health, and Biological Aging
Abstract
Background: Patients with colorectal cancer (CRC) often experience chemotoxicity, which affects treatment adherence, survival, and quality of life. Early chemotoxicity screening is vital, yet comprehensive predictive models are lacking. We aimed to develop artificial intelligence (AI)/machine learning (ML)-based models to predict global, gastrointestinal (GI), and hematological chemotoxicity by incorporating racialized group, social determinants of health (SDOH, including the Area Deprivation Index [ADI], a measure of geospatial variation), and biological aging (measured by the blood-based Levine PhenoAge).
Methods: We used electronic health records data from 1,735 adult CRC patients. Sociodemographic/clinical variables, Levine PhenoAge (biological aging), and SDOH (including geospatial data measured by the ADI) were analyzed using descriptive statistics. Associations with chemotoxicity (global, GI, hematological) were evaluated via univariate tests. Significant predictors from the univariate tests were selected for AI/ML modeling. Six supervised ML models were trained on 80% of cases (n=1,388), with 20% (n=347) reserved for testing. Performance was assessed via accuracy, area under the curve (AUC), and F1-score. Permutation feature importance was used to rank predictors and identify the most important predictors of chemotoxicity.
Results: Chemotoxicity incidences over 6 months of chemotherapy were 56% (global), 41% (GI), and 23% (hematological). The Support Vector Machine model, followed by the XGBoost model, demonstrated high accuracy in both the training and test datasets. Key predictors for global and GI toxicities included advanced biological aging (higher Levine PhenoAge), elevated inflammatory markers (e.g., C-reactive protein), and adverse SDOH, including geospatial deprivation (e.g., higher ADI) and unemployment. Hematological toxicity was linked to lower immune markers and higher biological age (Levine PhenoAge). Race (non-Hispanic Black), body mass index, and lifestyle factors also influenced global and GI toxicities.
Conclusions: The ML models demonstrated high accuracy in chemotoxicity prediction. Biological aging, SDOH (including the ADI), and immune/inflammation markers were common risk factors for global and GI chemotoxicities. In contrast, hematological chemotoxicity was linked only to biological age and immune/inflammation markers. Integrating these factors into predictive models can help clinicians identify at-risk patients and tailor interventions (e.g., anti-inflammatory and anti-aging strategies) to reduce chemotoxicity and improve survivorship outcomes.
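The evaluation steps named in the Methods (an 80/20 holdout split, accuracy, F1-score, AUC, and permutation feature importance) can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline. The function names, the toy data, and the threshold rule standing in for a trained classifier are all hypothetical, and using accuracy as the score inside the permutation step is an assumption; the abstract does not state which score was permuted against.

```python
import random

def split_sizes(n_total, train_frac=0.8):
    """Return (n_train, n_test) for a simple holdout split."""
    n_train = round(n_total * train_frac)
    return n_train, n_total - n_train

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as 0.5. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            perm_acc = accuracy(y, [predict(row) for row in X_perm])
            drops.append(baseline - perm_acc)
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: column 0 drives the label, column 1 is pure noise.
X = [[0.9, 0.3], [0.8, 0.7], [0.7, 0.1], [0.6, 0.9],
     [0.4, 0.2], [0.3, 0.8], [0.2, 0.5], [0.1, 0.6]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

def toy_model(row):
    """Stand-in for a trained classifier: thresholds column 0 only."""
    return int(row[0] > 0.5)

print(split_sizes(1735))  # (1388, 347), matching the reported split
print(permutation_importance(toy_model, X, y))
```

In this sketch, a feature the model ignores gets an importance of exactly zero, while shuffling a feature the model relies on lowers accuracy. That is the logic by which permutation importance ranks predictors.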