An Explainable and Transparent Machine Learning Approach for Predicting Dental Caries: A Cross-National Validation Study

Abstract

BACKGROUND Artificial intelligence (AI) studies in dentistry have increased notably. However, inadequate validation has led to overly optimistic performance metrics for machine learning (ML) models. External validation provides evidence of an ML model's performance on independent datasets and is crucial for establishing generalizability.

METHODS We developed Extreme Gradient Boosting (XGBoost) models to detect dental caries using easy-to-collect questionnaire data. Models were trained using nested cross-validation with a holdout test set, on NHANES datasets (n = 6070). The trained model was then tested on external data from the Northern Finland Birth Cohorts (NFBC1966 and NFBC1986; n = 3616). To enhance interpretability, beeswarm plots were constructed to visualize variable importance.

RESULTS The ML model performed well in predicting dental caries on the internal dataset, with an area under the receiver operating characteristic curve (AUC) of 0.821 (95% CI 0.795–0.846). However, it had difficulty identifying participants with dental caries: sensitivity was poor at 0.420, despite a high specificity of 0.916. On the external dataset, performance dropped substantially: the AUC fell to 0.550 (95% CI 0.532–0.569) and sensitivity to 0.053, while specificity rose slightly to 0.974. Important variables identified by the model included self-rated condition of teeth and gums, presence of missing teeth, financial status, and time since last dental visit.

CONCLUSION The performance of our ML model degraded notably in external validation compared with internal validation. However, the explainable AI (XAI) methodology showed great potential for future use in individualized dental caries risk assessment.
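The three metrics reported above (sensitivity, specificity, and AUC) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical and bear no relation to the NHANES or NFBC data. The AUC is computed here as the fraction of positive/negative pairs in which the positive case receives the higher score, which is one standard way to define it.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, 1 = caries."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only (not study data).
toy_labels: List[int] = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_scores: List[float] = [0.9, 0.6, 0.4, 0.2, 0.7, 0.3, 0.3, 0.2, 0.1, 0.1]

# Assumed decision threshold of 0.5; the abstract does not state one.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

sens, spec = sensitivity_specificity(toy_labels, toy_preds)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} "
      f"AUC={auc(toy_labels, toy_scores):.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC does not, so a model can combine a moderate AUC with very low sensitivity if the threshold classifies few participants as positive.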
