Digital Metabolism Twins Reveal Latent Dietary Metabolic States for Diabetes Risk Stratification
Abstract
Digital twins have emerged as a promising paradigm for personalized medicine, yet most health-focused implementations rely on static clinical variables and lack explicit modelling of dynamic lifestyle behaviours. Dietary intake, a major modifiable determinant of metabolic risk, is typically represented using aggregated features that obscure temporal structure. Here, we propose a digital metabolism twin framework that learns latent dietary–metabolic states from short-term temporal dietary intake sequences and integrates them with clinical variables for diabetes risk stratification. Using data from the U.S. National Health and Nutrition Examination Survey (NHANES), we encode two-day dietary intake profiles with a variational autoencoder to derive low-dimensional latent representations capturing metabolic patterns. These latent states are fused with demographic, anthropometric, and behavioural features and evaluated across multiple machine learning models. Incorporation of latent dietary states improves predictive discrimination, calibration, and clinical risk stratification, with the largest gains observed for gradient boosting and CatBoost models. Our results demonstrate that diet-derived latent metabolic representations provide complementary information beyond traditional features and support the utility of digital metabolism twins for interpretable and scalable metabolic risk assessment.
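The encode-then-fuse step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic rather than NHANES, the encoder weights are random and untrained, and every dimension, the one-hidden-layer architecture, the flattening of the two recall days into one vector, and the choice to fuse the posterior mean (rather than a sample) are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: none of these values come from the preprint.
N_PEOPLE, N_DAYS, N_NUTRIENTS = 8, 2, 10   # two dietary recall days per person
N_CLINICAL, LATENT_DIM, HIDDEN = 5, 3, 16

# Synthetic stand-ins for dietary intake sequences and clinical covariates.
diet = rng.normal(size=(N_PEOPLE, N_DAYS, N_NUTRIENTS))
clinical = rng.normal(size=(N_PEOPLE, N_CLINICAL))


def init_encoder(in_dim, hidden, latent, rng):
    """Random (untrained) weights for a one-hidden-layer Gaussian encoder."""
    return {
        "W1": rng.normal(scale=0.1, size=(in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W_mu": rng.normal(scale=0.1, size=(hidden, latent)),
        "b_mu": np.zeros(latent),
        "W_lv": rng.normal(scale=0.1, size=(hidden, latent)),
        "b_lv": np.zeros(latent),
    }


def encode(x, p):
    """Map inputs to the mean and log-variance of the approximate posterior."""
    h = np.tanh(x @ p["W1"] + p["b1"])
    mu = h @ p["W_mu"] + p["b_mu"]
    logvar = h @ p["W_lv"] + p["b_lv"]
    return mu, logvar


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def kl_to_standard_normal(mu, logvar):
    """Per-sample KL divergence between N(mu, sigma^2) and N(0, I)."""
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)


# Assumption: the two recall days are flattened into a single input vector.
x = diet.reshape(N_PEOPLE, N_DAYS * N_NUTRIENTS)
params = init_encoder(x.shape[1], HIDDEN, LATENT_DIM, rng)

mu, logvar = encode(x, params)
z = reparameterize(mu, logvar, rng)      # sample that a decoder would consume in training
kl = kl_to_standard_normal(mu, logvar)   # regularisation term of the VAE objective

# Fusion: concatenate clinical features with the latent dietary state.
fused = np.concatenate([clinical, mu], axis=1)
print(fused.shape)
```

In a full pipeline the encoder would be trained jointly with a decoder on a reconstruction-plus-KL objective, and `fused` would then be passed to a downstream classifier such as a gradient boosting model.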