COVID-19 prevalence estimation by random sampling in population - optimal sample pooling under varying assumptions about true prevalence

Ola Brynildsrud

Listed in

Evaluated articles (ScreenIT)

Abstract

Background

The number of confirmed COVID-19 cases divided by population size is used as a coarse measure of the burden of disease in a population. However, this fraction depends heavily on the sampling intensity and on the test criteria used in different jurisdictions, and many sources indicate that a large fraction of cases goes undetected.

Methods

Estimates of the true prevalence of COVID-19 in a population can be made by random sampling and pooling of RT-PCR tests. Here I use simulations to explore how sample size and the degree of sample pooling affect the precision of prevalence estimates and the potential for minimizing the total number of tests required to obtain individual-level diagnostic results.
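The idea of estimating prevalence from pooled tests can be illustrated with a minimal sketch. It is not the article's simulation code; it assumes a perfect test (no false positives or negatives, no dilution effect) and equal-sized pools, and the function names `estimate_prevalence` and `simulate_pooled_survey` are hypothetical. Under those assumptions a pool of size k is negative with probability (1 - p)^k, so the fraction of positive pools can be inverted to give an estimate of p.

```python
import random


def estimate_prevalence(positive_pools, total_pools, pool_size):
    """Maximum-likelihood prevalence estimate from pooled results.

    Assumes a perfect test, so a pool is negative only when every
    sample in it is negative: P(pool negative) = (1 - p) ** pool_size.
    """
    if total_pools <= 0 or pool_size <= 0:
        raise ValueError("total_pools and pool_size must be positive")
    frac_negative = 1.0 - positive_pools / total_pools
    return 1.0 - frac_negative ** (1.0 / pool_size)


def simulate_pooled_survey(true_prevalence, n_samples, pool_size, rng):
    """Simulate one pooled survey and return the estimated prevalence.

    Samples that do not fill a complete pool are discarded.
    """
    n_pools = n_samples // pool_size
    positive_pools = 0
    for _ in range(n_pools):
        # A pool tests positive if at least one member is infected.
        if any(rng.random() < true_prevalence for _ in range(pool_size)):
            positive_pools += 1
    return estimate_prevalence(positive_pools, n_pools, pool_size)


if __name__ == "__main__":
    rng = random.Random(42)
    # 5000 people tested in 500 pools of 10, true prevalence 1 %.
    print(simulate_pooled_survey(0.01, 5000, 10, rng))
```

Repeating `simulate_pooled_survey` many times for different values of `pool_size` gives a rough picture of how the spread of the estimate changes as pools get larger while the number of tests falls.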

Results

Sample pooling can greatly reduce the total number of tests required for prevalence estimation. In low-prevalence populations, it is theoretically possible to pool hundreds of samples with only marginal loss of precision. Even when the true prevalence is as high as 10%, it can be appropriate to pool up to 15 samples. Sample pooling can be particularly beneficial when the test has imperfect specificity, because it provides more accurate estimates of the prevalence than an equal number of individual-level tests.
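How imperfect specificity enters a pooled estimate can be sketched as follows. This is an illustration under stated assumptions, not the article's method: sensitivity and specificity are assumed to apply per pool test and to be independent of how many positive samples a pool contains, and the function names are hypothetical. With pool size 1 the correction reduces to the standard Rogan-Gladen adjustment.

```python
def pool_positive_probability(prevalence, pool_size, sensitivity, specificity):
    """Probability that a pool tests positive with an imperfect test.

    Assumes sensitivity and specificity act on the pool as a whole.
    """
    p_all_negative = (1.0 - prevalence) ** pool_size
    true_positive = sensitivity * (1.0 - p_all_negative)
    false_positive = (1.0 - specificity) * p_all_negative
    return true_positive + false_positive


def corrected_prevalence(positive_fraction, pool_size, sensitivity, specificity):
    """Invert pool_positive_probability to recover the prevalence.

    The intermediate value is clipped to [0, 1] because an observed
    fraction can fall outside the range the model allows.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must perform better than chance")
    p_all_negative = (sensitivity - positive_fraction) / denom
    p_all_negative = min(1.0, max(0.0, p_all_negative))
    return 1.0 - p_all_negative ** (1.0 / pool_size)


if __name__ == "__main__":
    # Expected fraction of positive pools at 2 % prevalence, pools of 10.
    q = pool_positive_probability(0.02, 10, sensitivity=0.95, specificity=0.99)
    print(q, corrected_prevalence(q, 10, sensitivity=0.95, specificity=0.99))
```

In this model the false-positive rate per test stays at 1 minus the specificity regardless of pool size, while the share of pools that are truly positive grows with pool size, which is one way to see why pooling can help when specificity is imperfect.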

Conclusion

Sample pooling should be considered in COVID-19 prevalence estimation efforts.

ScreenIT
Mar 1, 2021
SciScore for 10.1101/2020.05.05.20075275: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
No key resources detected.
Results from OddPub: Thank you for sharing your code.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
About SciScore
SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
Version published to 10.1186/s12874-020-01081-0
Jul 23, 2020
Version published to 10.21203/rs.3.rs-32082/v2 on Research Square
Jul 10, 2020
Version published to 10.21203/rs.3.rs-32082/v1 on Research Square
Jun 4, 2020
Version published to 10.1101/2020.05.05.20075275 on medRxiv
May 8, 2020
