Evaluation of a genetic risk score for severity of COVID-19 using human chromosomal-scale length variation

Christopher Toh
James P. Brody

Listed in

Evaluated articles (ScreenIT)

Abstract

Introduction

The course of COVID-19 varies from asymptomatic to severe in patients. The basis for this range in symptoms is unknown. One possibility is that genetic variation is partly responsible for the highly variable response. We evaluated how well a genetic risk score based on chromosomal-scale length variation and machine learning classification algorithms could predict severity of response to SARS-CoV-2 infection.

Methods

We compared 981 patients from the UK Biobank dataset who had a severe reaction to SARS-CoV-2 infection before 27 April 2020 to a similar number of age-matched patients drawn from the general UK Biobank population. For each patient, we built a profile of 88 numbers characterizing the chromosomal-scale length variability of their germ line DNA. Each number represented one quarter of one of the 22 autosomes (22 × 4 = 88). We used the machine learning algorithm XGBoost to build a classifier that could predict whether a person would have a severe reaction to COVID-19 based only on their 88-number profile.
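The 88-number profile described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that each autosome is represented by an ordered list of per-probe values and that each quarter is summarized by its mean. Neither detail is stated in the abstract. The names `quarter_means` and `build_profile` are hypothetical, and the data are synthetic.

```python
import random
from statistics import mean

N_AUTOSOMES = 22  # chromosomes 1..22, as in the abstract
QUARTERS = 4      # each autosome is split into four parts

def quarter_means(values):
    """Split one chromosome's ordered values into four contiguous
    quarters and return the mean of each (assumed summary statistic).
    Requires at least four values so that no quarter is empty."""
    n = len(values)
    bounds = [round(i * n / QUARTERS) for i in range(QUARTERS + 1)]
    return [mean(values[bounds[i]:bounds[i + 1]]) for i in range(QUARTERS)]

def build_profile(genome):
    """genome maps autosome number (1..22) to a list of per-probe values
    ordered by position. Returns a flat list of 22 * 4 = 88 numbers."""
    profile = []
    for chrom in range(1, N_AUTOSOMES + 1):
        profile.extend(quarter_means(genome[chrom]))
    return profile

# Synthetic stand-in data: 200 made-up probe values per autosome.
rng = random.Random(0)
genome = {c: [rng.gauss(0.0, 0.1) for _ in range(200)]
          for c in range(1, N_AUTOSOMES + 1)}

profile = build_profile(genome)
print(len(profile))  # 88
```

A list of such profiles, one per patient, together with a 0/1 severity label, is the kind of input a gradient-boosted classifier such as XGBoost would be trained on.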

Results

We found that the XGBoost classifier could differentiate between the two classes at a statistically significant level, both as measured against a randomized control (p = 2 × 10⁻¹¹) and as measured against the expected value of a random guessing algorithm, AUC = 0.5 (p = 3 × 10⁻¹⁴). However, the AUC of the classifier was only 0.51, too low for a clinically useful test.
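To make the two quantities in this paragraph concrete, here is a minimal sketch of how an AUC can be computed and compared against the chance value of 0.5. It rests on assumptions: the abstract does not state which statistical test produced the reported p-values, so the one-sample t statistic over repeated runs is only one plausible approach, and the AUC values in the example are made up for illustration. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def auc(case_scores, control_scores):
    """AUC as the probability that a randomly chosen case receives a
    higher score than a randomly chosen control (ties count as 0.5)."""
    wins = 0.0
    for s in case_scores:
        for t in control_scores:
            if s > t:
                wins += 1.0
            elif s == t:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def t_statistic_vs_chance(aucs, chance=0.5):
    """One-sample t statistic testing whether the mean AUC over repeated
    runs differs from the chance value (assumed test, see lead-in)."""
    standard_error = stdev(aucs) / sqrt(len(aucs))
    return (mean(aucs) - chance) / standard_error

# Perfect separation gives 1.0; indistinguishable scores give 0.5.
print(auc([0.9, 0.8], [0.1, 0.2]))  # 1.0
print(auc([0.5], [0.5]))            # 0.5

# Made-up AUCs from five hypothetical repeated runs, all close to 0.51.
example_aucs = [0.50, 0.51, 0.52, 0.51, 0.51]
print(t_statistic_vs_chance(example_aucs))
```

The sketch shows why the two findings are compatible: an AUC that sits consistently just above 0.5 across many runs can be distinguished from chance with a very small p-value, while still meaning that the classifier ranks a case above a control only about 51% of the time.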

Conclusion

Genetics play a role in the severity of COVID-19, but we cannot yet develop a useful genetic test to predict severity.

ScreenIT
Mar 1, 2021
SciScore for 10.1101/2020.07.06.20147637:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
Institutional Review Board Statement not detected.
Randomization not detected.
Blinding not detected.
Power Analysis not detected.
Sex as a biological variable not detected.
Table 2: Resources
No key resources detected.
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.
About SciScore
SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
Version published to 10.1186/s40246-020-00288-y
Oct 9, 2020
Version published to 10.21203/rs.3.rs-45354/v2 on Research Square
Sep 23, 2020
Version published to 10.21203/rs.3.rs-45354/v1 on Research Square
Jul 22, 2020
Version published to 10.1101/2020.07.06.20147637 on medRxiv
Jul 7, 2020

Related articles

Development and Deployment of a Machine Learning–Based Predictive Model for COVID-19 Infection Using Patient Demographic and Symptom Data in Nigeria

This article has 10 authors:
1. Olanrewaju Eniade
2. Ezekiel Ukwenga
3. Uchenna Akuka
4. Opeyemi Adeniyi
5. Elonna Obak
6. Omolola Adeagbo
7. Peter Babatunde Olaitan
8. Rita Ayanbolade Olowe
9. Tolulope Opakunle
10. Olugbenga Adekunle Olowe
This article has no evaluations. Latest version: Jan 25, 2026

Genetic sequencing in Saudi patients with systemic lupus erythematosus

This article has 5 authors:
1. Ibrahim A. Al-Homood
2. Sarah Binhassan
3. Khalid AlMatham
4. Lena M. Hassen
5. Manar Samman
This article has no evaluations. Latest version: Feb 2, 2026

Machine Learning Models in Classifying, Predicting and Managing COVID-19 Severity

This article has 10 authors:
1. Larysa Sydorchuk
2. Maksym Sokolenko
3. Miroslav Škoda
4. Denys Nevinskyi
5. Yaroslav Vyklyuk
6. Ruslan Sydorchuk
7. Alina Sokolenko
8. Ludmila Sokolenko
9. Andrii Sydorchuk
10. Oleksandr Sokolenko
This article has no evaluations. Latest version: Jan 27, 2026
