Radiomic Features and Carotid Stenosis in Periodontitis: A Two-Stage Bootstrap and Multimodal Machine Learning Study for Oral-Vascular Research

Abstract

Objectives This study aims to develop and validate a machine learning model based on radiomic features from cone beam computed tomography (CBCT) of the jaws for early detection of potential carotid atherosclerosis in periodontitis patients.

Materials and methods The study used 279 observations, each with 206 features, to distinguish periodontitis patients with concomitant carotid atherosclerosis from those without. To address class imbalance, SMOTE oversampling was applied (dup_size=1), increasing the sample size to 390 observations. Feature selection used a two-stage bootstrap (n_bootstrap=1000). In each first-stage iteration, a dataset was created by resampling with replacement. Redundant variables were removed with Spearman's rank correlation (correlation coefficient > 0.8), and Lasso regression with ten-fold cross-validation then selected the predictive variables, defined as those with non-zero coefficients. Features selected at high frequency across the 1000 iterations entered a second round of bootstrap analysis, in which Logistic Regression combined with the Akaike Information Criterion (AIC) determined the final variable set.

Results The first-stage bootstrap identified 26 high-frequency features (selected more than 500 times in 1000 iterations), which the second stage refined to a final set of 20 features using Logistic Regression combined with AIC. Three machine learning models, Logistic Regression (LR), Support Vector Machine (SVM), and Random Forest (RF), were developed and evaluated using five-fold cross-validation. The RF model performed best, achieving an AUC of 0.892, sensitivity of 0.957, specificity of 0.710, and accuracy of 0.859. ROC curves and calibration plots indicated good predictive performance and calibration for all three models. Decision curve analysis showed that the RF model provided the highest net benefit across a range of risk thresholds, indicating its potential clinical utility for early detection of carotid atherosclerosis in periodontitis patients.

Conclusion This study developed a random forest model that uses jaw CBCT radiomics for early detection of carotid atherosclerosis in periodontitis patients. After two-stage feature selection and five-fold cross-validation, the model achieved an AUC of 0.892, with sensitivity of 0.957 and specificity of 0.710. These results indicate high predictive performance and potential clinical utility as a tool for early detection.
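
The decision curve analysis mentioned in the Results compares a model's net benefit with the "treat all" and "treat none" strategies over a range of risk thresholds. The Python sketch below illustrates that calculation using the standard net-benefit formula, TP/n - (FP/n) * pt / (1 - pt), where pt is the risk threshold. It is not the study's code: the function names, labels, and predicted probabilities are hypothetical and chosen only for illustration.

```python
# Minimal sketch of the net-benefit calculation behind a decision curve.
# Assumption: y_true holds 0/1 labels and y_prob holds predicted risks;
# the toy values below are made up and are not data from the study.


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on everyone with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among the cases flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that flags every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical labels and predicted risks for ten patients.
y_true = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.85, 0.55]

for pt in (0.1, 0.3, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    # "Treat none" always has a net benefit of 0.
    print(f"threshold={pt:.1f} model={model_nb:.3f} treat_all={all_nb:.3f}")
```

A model is preferred at a given threshold when its net benefit exceeds both reference strategies; plotting these values over a grid of thresholds gives the decision curve.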
