Efficient and Practical Sample Pooling for High-Throughput PCR Diagnosis of COVID-19

Abstract

In the global effort to combat the COVID-19 pandemic, governments and public health agencies are striving to rapidly increase the volume and rate of diagnostic testing. The most common form of testing today employs the polymerase chain reaction (PCR) to identify the presence of viral RNA in individual patient samples, one by one. This process has become one of the most significant bottlenecks to increased testing, especially because of reported shortages of the chemical reagents needed for PCR.

Recent technical advances have enabled High-Throughput PCR, in which multiple samples are pooled into one tube. Such methods can be highly efficient, saving large amounts of time and reagents. However, their efficiency is highly dependent on the frequency of positive samples, which varies significantly across regions and even within regions as testing criteria and conditions change.

Here, we present two optimized pooling strategies for large-scale diagnostic SARS-CoV-2 testing, both of which address dynamic conditions. In the first, we employ a simple information-theoretic heuristic to derive a highly efficient re-pooling protocol: an estimate of the target frequency determines the initial pool size, and any pools subsequently found positive are re-pooled at half size and tested again. For very rare targets (frequency < 0.05), this approach can reduce the number of necessary tests dramatically, for example by a factor of 50 for a target frequency of 0.001. The second method is a simpler approach of optimized one-time pooling followed by individual tests on positive pools. We show that this approach is just as efficient for moderate target frequencies (between 0.05 and 0.2), for example achieving a two-fold reduction in the number of tests when the frequency of positive samples is 0.07.
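The expected cost of the two strategies can be illustrated with a short Python sketch. It is not the authors' code: it assumes independent samples, a perfectly accurate test, and a plain halving scheme in which every half of a positive pool is retested (no inference shortcuts). The function names, the `max_n=64` cap, and the rule "pick the largest power-of-two pool that is still negative with probability at least 0.5" are assumptions made here for illustration.

```python
def one_time_pooling_cost(p, n):
    """Expected tests per sample for one-time pooling with pool size n.

    One pooled test is shared by n samples; if the pool is positive
    (probability 1 - (1 - p)**n), all n samples are retested individually.
    """
    if n == 1:
        return 1.0  # no pooling: one test per sample
    return 1.0 / n + 1.0 - (1.0 - p) ** n


def best_one_time_pool(p, max_n=64):
    """Pool size in 1..max_n with the lowest expected cost (max_n is an assumed cap)."""
    return min(range(1, max_n + 1), key=lambda n: one_time_pooling_cost(p, n))


def initial_pool_size(p):
    """Assumed heuristic: largest power of two n with (1 - p)**n >= 0.5."""
    n = 1
    while (1.0 - p) ** (2 * n) >= 0.5:
        n *= 2
    return n


def halving_cost(p, n):
    """Expected tests per sample when positive pools are split in half and retested.

    n must be a power of two. A sub-pool of size m is tested exactly when its
    parent pool of size 2*m is positive, so by linearity of expectation each
    level contributes (n / m) * (1 - (1 - p)**(2 * m)) expected tests.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    tests = 1.0  # the initial pool is always tested
    m = n // 2
    while m >= 1:
        tests += (n / m) * (1.0 - (1.0 - p) ** (2 * m))
        m //= 2
    return tests / n


if __name__ == "__main__":
    for p in (0.001, 0.07):
        n1 = best_one_time_pool(p)
        n2 = initial_pool_size(p)
        print(f"p={p}: one-time n={n1}, cost={one_time_pooling_cost(p, n1):.4f}; "
              f"halving n={n2}, cost={halving_cost(p, n2):.4f}")
```

Under these assumptions, one-time pooling at a frequency of 0.07 costs about half a test per sample, and halving at a frequency of 0.001 costs roughly one fiftieth of a test per sample, in line with the figures quoted above. Real protocols must also respect the dilution limits on pool size.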

These strategies require little investment, and they offer a significant reduction in the amount of materials, equipment and time needed to test large numbers of samples. We show that both pooling strategies are roughly comparable to the absolute upper-bound efficiency given by Shannon’s source coding theorem. We compare our strategies to the naïve approach of testing each sample individually and to alternative matrix-pooling methods. Most importantly, we offer straightforward, practical pooling instructions for laboratories that perform large-scale PCR assays to detect SARS-CoV-2. These two pooling strategies may offer ways to alleviate the bottleneck currently preventing massive expansion of SARS-CoV-2 testing around the world.
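The Shannon bound mentioned here can be made concrete with a small calculation. If each sample is independently positive with frequency p and each test returns one binary outcome, no scheme can use fewer than H(p) tests per sample on average, where H is the binary entropy. The sketch below is an illustration under those assumptions; the function names are chosen here and do not come from the article.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits: a lower bound on expected tests per sample."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # the outcome is certain, so no test carries information
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def max_reduction_factor(p):
    """Largest possible saving relative to one test per sample (requires 0 < p < 1)."""
    return 1.0 / binary_entropy(p)


if __name__ == "__main__":
    for p in (0.001, 0.07):
        print(f"p={p}: H(p)={binary_entropy(p):.4f} tests/sample, "
              f"max reduction factor={max_reduction_factor(p):.1f}")
```

Comparing a strategy's expected tests per sample against `binary_entropy(p)` indicates how much room for improvement remains at a given frequency.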

Article activity feed

  1. SciScore for 10.1101/2020.04.06.20052159:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    It is important to note that technical limitations may limit the maximal batch size. This is because the process of pooling multiple patient samples into one tube inevitably causes dilution of the RNA of each individual sample. While it was shown empirically that 64 samples could be pooled together into a combined sample that contains enough RNA copies for detection [2], further empirical work should be conducted in order to determine the maximal pool size possible. For very large pools, improvement could be achieved with a minor change in the existing protocol (e.g. extracting higher concentrations of RNA content, perhaps at the expense of some background reagents). Nevertheless, we chose here to show the theoretical optimal batch size, even if its feasibility is still somewhat questionable. To keep our methods implementable immediately, we calculate their performance also for a constrained batch size and present these in the practical tables and protocols. Note, as we write in our protocols, that even moderate batch sizes require an appropriate adjustment of the cycle-threshold for detection. We hope this study will help increase the number of tests, thus improving local governments’ and agencies’ ability to monitor and prevent the spread of COVID-19.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.