Identifying risk factors for COVID-19 severity and mortality in the UK Biobank

This article has been Reviewed by the following groups

Read the full article See related articles

Abstract

Severe acute respiratory syndrome coronavirus has infected over 114 million people worldwide as of March 2021, with worldwide mortality rates ranging between 1-10%. We use information on up to 421,111 UK Biobank participants to identify possible predictors for long-term susceptibility to severe COVID-19 infection ( N =1,088) and mortality ( N =376). We include 36,168 predictors in our analyses and use a gradient boosting decision tree (GBDT) algorithm and feature attribution based on Shapley values, together with traditional epidemiological approaches to identify possible risk factors. Our analyses show associations between socio-demographic factors (e.g. age, sex, ethnicity, education, material deprivation, accommodation type) and lifestyle indicators (e.g. smoking, physical activity, walking pace, tea intake, and dietary changes) with risk of developing severe COVID-19 symptoms. Blood (cystatin C, C-reactive protein, gamma glutamyl transferase and alkaline phosphatase) and urine (microalbuminuria) biomarkers measured more than 10 years earlier predicted severe COVID-19. We also confirm increased risks for several pre-existing disease outcomes (e.g. lung diseases, type 2 diabetes, hypertension, circulatory diseases, anemia, and mental disorders). Analyses on mortality were possible within a sub-group testing positive for COVID-19 infection ( N =1,953) with our analyses confirming association between age, smoking status, and prior primary diagnosis of urinary tract infection.

SUMMARY

Our hypothesis-free approach combining machine learning with traditional epidemiological methods finds a number of risk factors (sociodemographic, lifestyle, and psychosocial factors, biomarkers, disease outcomes and treatments) associated with developing severe COVID-19 symptoms and COVID-19 mortality.

Article activity feed

  1. SciScore for 10.1101/2021.05.10.21256935: (What is this?)

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    EthicsIRB: The UK Biobank project was approved by North West Multicenter Research Ethics Committee and the National Information Governance Board for Health and Social Care (11/NW/0382).
    Consent: Informed consent was obtained at the time of enrolment from all participants [9].
    Field Sample Permit: This study was conducted under application number 20175 to the UK Biobank.
    Sex as a biological variablenot detected.
    Randomizationnot detected.
    Blindingnot detected.
    Power Analysisnot detected.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.