Obtaining Prevalence Estimates of Coronavirus Disease 2019: A Model to Inform Decision-Making
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
We evaluated whether randomly sampling and testing a set number of individuals for coronavirus disease 2019 (COVID-19), while adjusting for misclassification error, captures the true prevalence. We also quantified the impact of misclassification error bias on publicly reported case data in Maryland. Using a stratified random sampling approach, 50,000 individuals were selected from a simulated Maryland population to estimate the prevalence of COVID-19. We examined situations in which the true prevalence is low (0.07%–2%), medium (2%–5%), and high (6%–10%). Bayesian models informed by published validity estimates were used to account for misclassification error when estimating COVID-19 prevalence. Adjustment for misclassification error captured the true prevalence 100% of the time, irrespective of the true prevalence level. Without adjustment for misclassification error, the results varied widely depending on the population's underlying true prevalence and the type of diagnostic test used. Generally, the unadjusted prevalence estimates worsened as the true prevalence level increased. Adjusting publicly reported Maryland data for misclassification error led to a minimal, non-significant increase in the estimated average daily cases. Random sampling and testing for COVID-19, with adjustment for misclassification error, are needed to improve COVID-19 prevalence estimates.
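The misclassification adjustment the abstract describes can be illustrated with the classical Rogan–Gladen correction, which recovers true prevalence from apparent (test-positive) prevalence given a test's sensitivity and specificity. This is a minimal frequentist sketch, not the study's Bayesian model, and the sensitivity, specificity, and positivity values below are hypothetical:

```python
def rogan_gladen(apparent_prev, sensitivity, specificity):
    """Adjust apparent prevalence for diagnostic misclassification.

    true = (apparent + specificity - 1) / (sensitivity + specificity - 1)
    The result is clamped to [0, 1], since sampling noise can push the
    raw estimate outside the valid range.
    """
    adjusted = (apparent_prev + specificity - 1.0) / (sensitivity + specificity - 1.0)
    return min(max(adjusted, 0.0), 1.0)

# Hypothetical example: 3% of 50,000 sampled individuals test positive
# on a test with 90% sensitivity and 98% specificity.
print(round(rogan_gladen(0.03, 0.90, 0.98), 4))  # → 0.0114
```

Note that when the apparent prevalence falls below the test's false-positive rate (1 − specificity), the unclamped estimate goes negative, which is one reason the low-prevalence regime is the hardest case for unadjusted estimates.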
Article activity feed
SciScore for 10.1101/2020.08.06.20169656:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Antibodies
- Sentence: "We define 'past COVID-19 infection' as an individual who would test positive for IgG, IgM and IgG, or total antibody through a serological test for a COVID-19 antibody response due to past exposure to SARS-CoV-2."
- Resource: SARS-CoV-2 (suggested: None)
Software and Algorithms
- Sentence: "All data were analyzed using RStudio Version 1.2.5042 and R Foundation for Statistical Computing 4.0.0."
- Resource: RStudio (suggested: RStudio, RRID:SCR_000432)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: There are some limitations in our study. First, our study used simulated data to randomly select individuals and estimate the prevalence of active and past COVID-19 in Maryland while adjusting for misclassification error. As a result, our analysis does not reflect some additional biases that may arise in the field, such as non-response bias from individuals refusing to be tested. Any implementation of our proposed sampling and analysis approach in the field would need further adjustment for non-response bias. The advantage, however, of simulated data was our ability to determine whether the random sampling of individuals, followed by adjustment for misclassification error bias in the absence of other biases, captured our simulated prevalence of active and past COVID-19. This level of comparison would not be feasible with real data. Second, our simulated prevalence of 0.5% and 1% for active and past COVID-19, respectively, may not be correct for Maryland. Since it is impossible to validate these values given that the true prevalence is unknown, we chose low yet realistic prevalence values to implement this study. Third, we did not know which exact tests are most generally used in Maryland to diagnose individuals, so we used published sensitivity and specificity values to adjust our model. Therefore, our results should be interpreted with caution, and the data should be re-analyzed with the sensitivity and specificity values specific to the diagnostic tests used if available…
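The study's model is Bayesian, combining the observed test results with published sensitivity and specificity values. As a rough illustration of how such an adjustment can be computed, here is a grid-approximation sketch of the posterior over true prevalence under a uniform prior; the sample counts and validity values are hypothetical, not the study's:

```python
import math

def posterior_prevalence(n_pos, n_total, sensitivity, specificity, grid_size=2001):
    """Grid-approximate posterior over true prevalence with a uniform prior.

    The probability that a sampled individual tests positive is
        p_obs = prev * sensitivity + (1 - prev) * (1 - specificity),
    and the likelihood of n_pos positives out of n_total is binomial in p_obs.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    log_like = []
    for prev in grid:
        p_obs = prev * sensitivity + (1.0 - prev) * (1.0 - specificity)
        p_obs = min(max(p_obs, 1e-12), 1.0 - 1e-12)  # guard the log
        log_like.append(n_pos * math.log(p_obs) + (n_total - n_pos) * math.log(1.0 - p_obs))
    # Normalize in log space to avoid underflow, then exponentiate.
    m = max(log_like)
    weights = [math.exp(v - m) for v in log_like]
    total = sum(weights)
    posterior = [w / total for w in weights]
    mean = sum(p * w for p, w in zip(grid, posterior))
    return grid, posterior, mean

# Hypothetical: 1,500 positives among 50,000 sampled; sens 0.90, spec 0.98.
_, _, mean_prev = posterior_prevalence(1500, 50000, 0.90, 0.98)
print(round(mean_prev, 4))
```

With a large sample, the posterior mean lands close to the Rogan–Gladen point estimate, while the full posterior also conveys the uncertainty that a point correction discards.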
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.