Improving population-scale disease prediction through multi-omics integration
Abstract
The early detection of common diseases is currently constrained by a dependence on single biomarkers, which often capture only a limited aspect of disease pathology. Here, we applied multi-omic factor analysis to integrate plasma proteins, metabolites, and biochemical and haematological measurements from 46,081 individuals in the UK Biobank Pharma Proteomics Project. We identified 20 latent factors that capture major axes of biological variation across omic layers and found that these factors showed 400 significant associations with 111 incident diseases. We then incorporated the top 50 multi-omic features per factor into predictive models and observed improved disease discrimination for 85% of conditions compared with the best single biomarker. For example, the multi-omic models for predicting incident diabetes (C-index = 0.865, 95% CI: 0.850-0.882) and iron-deficiency anaemia (C-index = 0.750, 95% CI: 0.726-0.778) significantly outperformed their respective clinical biomarkers, HbA1c (C-index = 0.810, 95% CI: 0.791-0.831) and haemoglobin concentration (C-index = 0.668, 95% CI: 0.644-0.692). We therefore conclude that population-scale multi-omic integration reveals shared disease mechanisms and enhances risk prediction, providing a framework for earlier and biologically informed diagnosis and prevention.
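The discrimination results above are reported as C-indices. As a minimal sketch of how such a comparison could be computed, the Python below implements Harrell's concordance index for right-censored follow-up data and applies it to two risk scores. The abstract does not state which C-index estimator was used, so Harrell's estimator is an assumption here. The function name `harrell_c_index` and the variable names are hypothetical, and the five-person dataset is made up purely for illustration; none of it comes from the study.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times       -- follow-up time for each person
    events      -- 1 if the disease was observed, 0 if the person was censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied follow-up times are skipped
        # Let `a` be the person with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # the earlier person was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # the person with the earlier event had the higher score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only (not from the study).
times = [2, 4, 5, 7, 9]
events = [1, 1, 0, 1, 0]
single_marker_score = [0.9, 0.3, 0.5, 0.4, 0.1]  # hypothetical single-biomarker score
multi_feature_score = [0.9, 0.7, 0.5, 0.4, 0.1]  # hypothetical multi-feature score

c_single = harrell_c_index(times, events, single_marker_score)
c_multi = harrell_c_index(times, events, multi_feature_score)
print(f"single marker C-index: {c_single:.3f}")
print(f"multi-feature C-index: {c_multi:.3f}")
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking of event times. The confidence intervals quoted in the abstract would additionally require a resampling or analytic variance step, which this sketch omits.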