Protocol for the development and evaluation of a tool for predicting risk of short-term adverse outcomes due to COVID-19 in the general UK population

Julia Hippisley-Cox
Ash K. Clift
Carol Coupland
Ruth Keogh
Karla Diaz-Ordaz
Elizabeth Williamson
Ewen M. Harrison
Andrew Hayward
Harry Hemingway
Peter Horby
Nisha Mehta
Jonathan Benger
Kamlesh Khunti
David Spiegelhalter
Aziz Sheikh
Jonathan Valabhji
Ronan A. Lyons
John Robson
Calum Semple
Frank Kee
Peter Johnson
Susan Jebb
Tony Williams
David Coggon

Abstract

Introduction

Novel coronavirus disease 2019 (COVID-19) has caused a global pandemic with significant health, economic and social costs. Emerging evidence suggests that several factors may be associated with an increased risk of severe outcomes or death from COVID-19. Clinical risk prediction tools have significant potential to generate individualised assessments of risk and may be useful for population stratification and other use cases.

Methods and analysis

We will use a prospective open cohort study of routinely collected data from 1205 general practices in England in the QResearch database. The primary outcome in adults is COVID-19 mortality (in or out of hospital), defined as confirmed or suspected COVID-19 mentioned on the death certificate, or death occurring in a person with SARS-CoV-2 infection between 24th January and 30th April 2020. We will also examine COVID-19 hospitalisation in children. Time-to-event models will be developed in the training data to derive separate risk equations for males and females in adults (19-100 years). These equations will estimate the risk of each outcome within the 3-month follow-up period (24th January to 30th April 2020), accounting for competing risks. Predictors considered will include age, sex, ethnicity, deprivation, smoking status, alcohol intake, body mass index, pre-existing medical co-morbidities, and concurrent medication. Measures of performance (prediction errors, calibration and discrimination) will be determined in the test data for men and women separately and by ten-year age group. For children, descriptive statistics will be reported if there are too few serious events to allow development of a risk model. The final model will be externally evaluated in (a) geographically separate practices and (b) other relevant datasets as they become available.
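To illustrate what "accounting for competing risks" means for an absolute risk estimate, the sketch below computes a nonparametric cumulative incidence curve (an Aalen-Johansen style estimator) for one cause of death when a competing cause can occur first. It is not the authors' method: the protocol describes regression-based time-to-event models, and the function name, event coding and toy data here are all assumptions made for illustration only.

```python
from collections import Counter


def cumulative_incidence(times, events, cause=1):
    """Nonparametric cumulative incidence of `cause` with competing risks.

    times  : follow-up time per person (e.g. days since cohort entry)
    events : 0 = censored, 1 = event of interest, 2 = competing event
             (hypothetical coding chosen for this sketch)
    Returns a list of (time, cumulative incidence) at each distinct event time.
    """
    # Tally how many people have each event type at each distinct time.
    by_time = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    event_free = 1.0  # probability of no event of any type just before t
    cif = 0.0         # running cumulative incidence of the cause of interest
    curve = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_cause = counts[cause]
        d_any = sum(n for e, n in counts.items() if e != 0)
        if d_any:
            # Only people still event-free can experience the event at t.
            cif += event_free * d_cause / n_at_risk
            event_free *= 1.0 - d_any / n_at_risk
            curve.append((t, cif))
        # Everyone observed at t (events and censorings) leaves the risk set.
        n_at_risk -= sum(counts.values())
    return curve


if __name__ == "__main__":
    # Invented toy data: ten people followed for up to 97 days.
    toy_times = [5, 12, 20, 33, 41, 58, 66, 75, 97, 97]
    toy_events = [1, 2, 0, 1, 2, 0, 1, 0, 0, 0]
    for day, risk in cumulative_incidence(toy_times, toy_events):
        print(f"day {day}: cumulative incidence {risk:.3f}")
```

Treating the competing event as ordinary censoring and reporting one minus the Kaplan-Meier estimate would overstate the absolute risk in this setting, which is one reason a protocol of this kind states the competing-risk handling explicitly.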

Ethics and dissemination

The project has ethical approval and the results will be submitted for publication in a peer-reviewed journal.

Strengths and limitations of the study

The individual-level linkage of general practice, Public Health England testing, Hospital Episode Statistics and Office for National Statistics death register datasets enables robust and accurate ascertainment of outcomes.
The models will be trained and evaluated in population-representative datasets of millions of individuals.
Shielding for clinically extremely vulnerable people was advised and in place during the study period; risk predictions influenced by the presence of some 'shielding' conditions may therefore require careful interpretation.

ScreenIT
Jul 2, 2020
SciScore for 10.1101/2020.06.28.20141986:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
No key resources detected.
Results from OddPub: Thank you for sharing your code.
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
The strengths and limitations of the approach have already been discussed in detail 8,11,23,24,41,42. In summary, key strengths include size, wealth of data on risk factors, good ascertainment of outcomes through multiple record linkage, prospective recording of outcomes, use of an established validated database which has been used to develop many risk prediction tools, and lack of selection, recall and respondent bias and robust analysis. UK general practices have good levels of accuracy and completeness in recording clinical diagnoses and prescribed medications 43. We think our study has good face validity since it has been conducted in the setting where most patients in the UK are assessed, treated and followed up. Limitations: Limitations of our study include the lack of formal adjudication of diagnoses, potential for misclassification of outcomes depending on testing, information bias, and potential for bias due to missing data. Our database has linked mortality and hospital admissions data and is therefore likely to have picked up the great majority of COVID-19 related ICU admissions and death thereby minimising ascertainment bias. The initial evaluation will be done on a separate set of practices and individuals to those which were used to develop the score although the practices all use the same GP clinical computer system (EMIS – the computer system used by 55% of UK GPs). An independent evaluation will be a more stringent test and should be done (e.g. using data fro...
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.
Version published to 10.1101/2020.06.28.20141986 on medRxiv
Jun 29, 2020
