Development and Validation of a Multivariable Prediction Model for Pre-diabetes and Diabetes Using Easily Obtainable Clinical Data

Abstract

Importance

In the US, pre-diabetes and diabetes are increasing in prevalence alongside other chronic diseases. Hemoglobin A1c is the most common diagnostic test for diabetes performed in the US, but it has known inaccuracies in the setting of other chronic diseases.

Objective

To determine if easily obtained clinical data could be used to improve the diagnosis of pre-diabetes and diabetes compared to hemoglobin A1c alone.

Design, Setting, and Participants

This cross-sectional study analyzed nationally representative data obtained from six 2-year cycles (2005 to 2006 through 2015 to 2016) of the National Health and Nutrition Examination Survey in the US. We excluded participants without hemoglobin A1c, oral glucose tolerance test, or sample weight data. The sample comprised 13,800 survey participants. Data analyses were performed from May 1, 2024 to February 9, 2025.

Main Outcomes and Measures

We estimated 2-hour glucose with a gradient-boosted decision tree machine learning model to diagnose pre-diabetes and diabetes, defined by an oral glucose tolerance test 2-hour glucose of 140 mg/dL to less than 200 mg/dL and of 200 mg/dL or greater, respectively. We compared the model's area under the receiver operating characteristic curve (AUROC), calibration, positive predictive value, and net benefit by decision curve analysis with those of hemoglobin A1c alone.
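
The outcome definitions and the net benefit measure can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the "normal" label are hypothetical, the net benefit expression is the standard decision curve formula (true positives per person minus false positives per person weighted by the odds of the threshold probability), and the NHANES survey weights are ignored for simplicity.

```python
from typing import Sequence


def classify_ogtt(two_hour_glucose_mg_dl: float) -> str:
    """Label a 2-hour OGTT glucose value using the cutoffs stated in the abstract."""
    if two_hour_glucose_mg_dl >= 200:
        return "diabetes"
    if two_hour_glucose_mg_dl >= 140:
        return "pre-diabetes"
    return "normal"  # hypothetical label for values below 140 mg/dL


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Unweighted net benefit at one threshold probability (0 < threshold < 1).

    y_true holds 1 for disease present and 0 for absent; y_prob holds the
    predicted probabilities. A prediction counts as positive when it is at
    or above the threshold.
    """
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * (threshold / (1 - threshold))
```

Evaluating `net_benefit` over a range of thresholds, once for the model and once for a comparator such as hemoglobin A1c, yields the curves that a decision curve analysis compares.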

Results

A 20-feature Model outperformed hemoglobin A1c and fasting plasma glucose for diagnosis, with AUROC improving from 0.66/0.71 to 0.77 for pre-diabetes and from 0.87/0.88 to 0.91 for diabetes. The Model also had a higher positive predictive value than hemoglobin A1c and a greater net benefit on decision curve analysis. The main features that improved diagnosis of pre-diabetes and diabetes were the standard vitals (age, height, weight, waist circumference, blood pressure, and pulse); the fasting labs (plasma glucose, insulin, triglycerides, and iron); the non-fasting labs (cholesterol, gamma-glutamyl transferase, creatinine, platelet count, segmented neutrophil percentage, urine albumin, and urine creatinine); and the social determinant of health factor Poverty Ratio.

Conclusions and Relevance

In this cross-sectional study of NHANES participants, we identified risk factors that could be incorporated into the electronic medical record to identify patients with potentially undiagnosed pre-diabetes and diabetes. Implementation could improve diagnosis and lead to earlier intervention on disease before it becomes severe and complications develop.

Key Points

Question

Can readily available clinical data improve diagnosis of pre-diabetes and diabetes compared to hemoglobin A1c testing alone?

Findings

In this cross-sectional study of 13,800 adults with paired hemoglobin A1c and oral glucose tolerance testing in the National Health and Nutrition Examination Survey, the rate of pre-diabetes undiagnosed by the hemoglobin A1c was 8.6% and the rate of diabetes undiagnosed by the hemoglobin A1c was 3.5%. A novel multivariable prediction model that included fasting plasma glucose, insulin, basic body measurements, and routinely available dyslipidemia and hepatic function labs was significantly more accurate (AUROC 0.66/0.71 to 0.77 for pre-diabetes, 0.87/0.88 to 0.91 for diabetes) than hemoglobin A1c or fasting plasma glucose alone.

Meaning

Incorporation of easily obtainable clinical data can improve diagnosis of pre-diabetes and diabetes compared to hemoglobin A1c alone.
