Dysregulated Immune Proteins in Plasma in the UK Biobank Predict Multiple Myeloma 12 Years Before Clinical Diagnosis
Abstract
High-throughput proteomics has emerged as a potentially rich data source for improving disease forecasting. This study explores the utility of plasma proteomics for identifying novel predictors of multiple myeloma (MM), combining machine learning with statistical approaches. Utilising data from the UK Biobank, including proteomic profiles of over 50,000 participants, we applied an extreme gradient boosting (XGBoost) algorithm with SHapley Additive exPlanations (SHAP) feature-importance measures to identify key proteomic biomarkers that predict the onset of MM. At least seven of the top 10 identified proteins are related to immune function and activation of lymphoid cells; two are validated MM targets with approved therapies. The top 10 proteins, along with key clinical predictors, were further analysed using Cox proportional hazards models to assess their contribution to incident MM risk. The 10 proteomic biomarkers ranked highest by SHAP value substantially outperformed traditional clinical predictors. This superior performance was maintained over the 12-year follow-up period, demonstrating the predictive ability of these proteomic biomarkers for early detection of MM. The dysregulated expression of these proteins in plasma from healthy individuals, if confirmed in prospective cohorts and independent datasets, could lead to novel approaches to screening for MM and precursor conditions.
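As a rough illustration of how features can be ranked by SHAP value, the sketch below computes exact Shapley values by brute force for a toy risk function and orders the features by mean absolute attribution. It is not the study's pipeline: the feature names, the toy model, and the sample values are all hypothetical, and a real analysis of an XGBoost model would typically rely on a tree-specific SHAP algorithm rather than enumerating feature subsets.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names; the abstract does not name the proteins.
FEATURES = ["protein_A", "protein_B", "protein_C"]


def toy_risk_model(z):
    """Hypothetical stand-in for a fitted model; protein_C is ignored on purpose."""
    return 2.0 * z[0] + 0.5 * z[1] + 0.3 * z[0] * z[1]


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Made-up standardised measurements for three participants.
samples = [[1.0, 1.0, 1.0], [-1.0, 2.0, 0.5], [0.5, -1.0, 3.0]]
baseline = [sum(col) / len(samples) for col in zip(*samples)]

all_phi = [shapley_values(toy_risk_model, x, baseline) for x in samples]

# Global importance: mean absolute Shapley value per feature.
mean_abs = [
    sum(abs(phi[i]) for phi in all_phi) / len(all_phi)
    for i in range(len(FEATURES))
]
ranking = sorted(FEATURES, key=lambda name: -mean_abs[FEATURES.index(name)])
print(ranking)
```

In a workflow like the one the abstract describes, the top-ranked features from such an ordering would then be carried into a survival model; here the ignored feature receives an attribution of exactly zero and falls to the bottom of the ranking.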