Machine learning and SHAP value interpretation for predicting comorbidity of periodontitis and metabolic syndrome
Abstract
Background Periodontitis and metabolic syndrome (MetS) frequently coexist and share inflammatory and microbiota-related mechanisms. Diet is a modifiable determinant of both conditions, yet previous studies have relied on single-nutrient approaches and linear statistics. We used interpretable machine-learning methods to identify gut microbiota–relevant dietary factors associated with periodontitis–MetS comorbidity in a nationally representative U.S. sample.
Methods Cross-sectional data from 8,694 adults (NHANES 2009–2014) with complete periodontal, metabolic, dietary and demographic information were analysed. Fifteen dietary predictors (14 continuous food-nutrient intakes and the Dietary Index for Gut Microbiota [DI-GM]) plus nine covariates were modelled. After preprocessing (one-hot encoding, z-score standardisation, SMOTE), seven algorithms (gradient boosting machine [GBM], random forest, XGBoost, LightGBM, support-vector machine, logistic regression, multilayer perceptron) were trained on 70% of the sample and validated on the remaining 30%. Performance was assessed by AUC, accuracy, F1 and calibration. SHapley Additive exPlanations (SHAP) provided global and individual feature attribution.
Results Comorbidity prevalence was 16.0%. Gradient boosting models yielded the best discrimination (GBM AUC = 0.927 [95% CI 0.915–0.939]; accuracy = 0.86; F1 = 0.71). Calibration curves and decision-curve analysis confirmed clinical utility across decision thresholds. SHAP identified processed meat and coffee intake as the strongest risk contributors, followed by lower DI-GM score, low dietary fibre and high fat intake. Protective features included higher consumption of fermented dairy, whole grains, soybeans and green tea. SHAP dependence plots revealed non-linear, dose-responsive relations, with risk plateauing beyond ≈ 5 servings/day of processed meat.
Conclusions An interpretable GBM model accurately predicts periodontitis–MetS comorbidity from gut microbiota–oriented dietary data. Processed meat, coffee and a poor overall DI-GM score emerge as modifiable dietary targets, while fibre- and probiotic-rich foods appear protective. Although causal inference is limited by the cross-sectional design, these findings support precision-nutrition strategies and highlight the value of explainable machine learning for integrated oral-systemic disease prevention. Prospective validation is warranted.
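For readers unfamiliar with the attribution method named in the abstract, the following minimal sketch illustrates the Shapley value idea that SHAP builds on. It is not the study's code: the function `toy_risk`, its coefficients, the three feature values and the baseline are all hypothetical, and features outside a coalition are simply set to a baseline value, which is one common simplification rather than the algorithm the SHAP library uses for tree models.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline input.

    Assumption: a feature that is "absent" from a coalition takes its
    baseline value. Cost grows as 2^n, so this only suits a few features.
    """
    n = len(x)

    def value(coalition):
        # Model output when only the features in `coalition` take x's values.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score, NOT the fitted GBM from the study."""
    processed_meat, fibre, di_gm = z
    # The cap at 5 mimics a plateau in the processed-meat effect.
    return 0.30 + 0.05 * min(processed_meat, 5.0) - 0.01 * fibre - 0.02 * di_gm


# Hypothetical individual and reference ("average") profile.
x = [3.0, 10.0, 4.0]
baseline = [1.0, 15.0, 5.0]

phi = shapley_values(toy_risk, x, baseline)
for name, contribution in zip(["processed_meat", "fibre", "di_gm"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The per-feature contributions always sum to the difference between the prediction for the individual and the prediction for the baseline, which is the property that lets SHAP decompose a single prediction into feature-level attributions.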