Multinational, Calibrated, Non-Laboratory Prevalent Disease Prediction and Survival Modeling for Diabetes, CKD, and CVD

Abstract

Reliable non-laboratory tools for assessing the probability of prevalent disease (PPD) are essential for scalable prevention, yet existing models are typically specific to a single disease, require laboratory tests, and show limited or no calibration across PPD strata, which limits their scalability and public health use. We developed and validated a unified machine-learning model that simultaneously predicts PPD for diabetes, chronic kidney disease, and cardiovascular disease from non-invasive predictors. The model was trained on 2011–2016 National Health and Nutrition Examination Survey data (n = 29,903) and evaluated on an independent 2017–2020 test set (n = 15,559). It demonstrated moderate-to-strong discrimination (C-statistic = 0.80–0.90), stable precision–recall performance, and moderate-to-strong calibration (slope > 0.94). Validation in an independent Korean population showed little or no degradation in discrimination or calibration, though more extensive validation is warranted. Predicted PPD was associated with cause-specific mortality over up to 7 years of follow-up, consistent with its serving as a predictor of latent disease burden. Each 10-percentage-point increase in predicted PPD was associated with roughly a two-fold higher hazard of disease-specific death (HR 2.00–2.20). We conclude that this model has potential as a scalable, low-burden screening and surveillance aid, but note that it is not intended as a diagnostic or prognostic tool.
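To make the per-10-percentage-point hazard ratios easier to read, the following minimal sketch rescales a reported HR to other increments of predicted PPD. It assumes the log-hazard is linear in predicted PPD, as in a standard Cox proportional hazards model; the abstract does not state the model form, and the function name `scale_hazard_ratio` is hypothetical.

```python
import math


def scale_hazard_ratio(hr_per_10_points: float, delta_points: float) -> float:
    """Rescale a hazard ratio reported per 10 percentage points of PPD.

    Assumption (not confirmed by the abstract): the log-hazard is linear
    in predicted PPD, so HR(delta) = HR_10 ** (delta / 10).
    """
    if hr_per_10_points <= 0:
        raise ValueError("hazard ratio must be positive")
    # Log-hazard coefficient per single percentage point of predicted PPD.
    beta_per_point = math.log(hr_per_10_points) / 10.0
    return math.exp(beta_per_point * delta_points)


if __name__ == "__main__":
    # Bounds of the HR range reported in the abstract.
    for hr10 in (2.00, 2.20):
        hr5 = scale_hazard_ratio(hr10, 5)
        hr20 = scale_hazard_ratio(hr10, 20)
        print(f"HR per 10 points = {hr10:.2f}: "
              f"per 5 points = {hr5:.2f}, per 20 points = {hr20:.2f}")
```

Under this assumption, a 20-point difference in predicted PPD corresponds to the square of the reported HR. Such rescaling is only illustrative, since linearity may not hold across the full PPD range.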
