A Universal Day Zero Infectious Disease Testing Strategy Leveraging CRISPR-based Sample Depletion and Metagenomic Sequencing



The lack of preparedness for detecting SARS-CoV-2, the highly infectious pathogen responsible for COVID-19, has caused enormous harm to public health and the economy. It took ∼60 days for the first reverse transcription quantitative polymerase chain reaction (RT-qPCR) tests for SARS-CoV-2 infection, developed by the United States Centers for Disease Control and Prevention (CDC), to be made publicly available. It then took >270 days to deploy 800,000 of these tests, at a time when estimated testing needs exceeded 6 million tests per day. Testing was therefore limited to individuals with symptoms or in close contact with confirmed positive cases. Testing strategies deployed on a population scale at ‘Day Zero’ (i.e., at the time of the first reported case) would be of significant value. Next Generation Sequencing (NGS) has such Day Zero capabilities, with the potential for broad and large-scale testing. However, NGS has limited sensitivity for pathogens present at low copy numbers. Here we demonstrate that by using CRISPR-Cas9 to remove abundant sequences that do not contribute to pathogen detection, the sensitivity of NGS for detecting SARS-CoV-2 is comparable to that of RT-qPCR. In addition, we show that this assay can be used for variant strain typing, co-infection detection, and individual human host response assessment, all in a single workflow using existing open-source analysis pipelines. This NGS workflow is pathogen agnostic, and therefore has the potential to transform how both large-scale pandemic response and focused clinical infectious disease testing are pursued in the future.


The lack of preparedness for detecting infectious pathogens has had a devastating effect on the global economy and society. A ‘Day Zero’ testing strategy that can be deployed at the first reported case and expanded to population scale is therefore required. Next generation sequencing enables Day Zero capabilities but is inadequate for detecting low levels of pathogen, because abundant sequences of little biological interest consume most of the reads. By applying the CRISPR-Cas system to remove these sequences in vitro, we show sensitivity of pathogen detection equivalent to RT-qPCR. The workflow is pathogen agnostic and enables detection of strain types, co-infections and human host response with a single workflow and open-source analysis tools. These results highlight the potential to transform future large-scale pandemic response.
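
The reasoning behind depletion can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from the article: it assumes a simple model in which a fraction of the library consists of abundant, uninformative sequences, a given share of those is removed, and sequencing depth is held fixed, so the freed reads are redistributed proportionally across everything that remains. The function names and the example numbers are hypothetical.

```python
def enrichment_factor(abundant_fraction: float, depletion_efficiency: float) -> float:
    """Fold increase in the share of reads from any non-depleted sequence.

    abundant_fraction: fraction of the library made up of sequences targeted
        for removal (assumed value, 0 <= f < 1).
    depletion_efficiency: fraction of those targeted sequences actually
        removed (assumed value, 0 <= e <= 1).
    """
    if not 0.0 <= abundant_fraction < 1.0:
        raise ValueError("abundant_fraction must be in [0, 1)")
    if not 0.0 <= depletion_efficiency <= 1.0:
        raise ValueError("depletion_efficiency must be in [0, 1]")
    # Fraction of the original library that survives depletion.
    remaining = 1.0 - abundant_fraction * depletion_efficiency
    return 1.0 / remaining


def expected_pathogen_reads(total_reads: int, pathogen_fraction: float,
                            abundant_fraction: float,
                            depletion_efficiency: float) -> float:
    """Expected pathogen reads at a fixed sequencing depth after depletion."""
    if pathogen_fraction < 0.0 or pathogen_fraction + abundant_fraction > 1.0:
        raise ValueError("pathogen and abundant fractions must fit in the library")
    return total_reads * pathogen_fraction * enrichment_factor(
        abundant_fraction, depletion_efficiency)


if __name__ == "__main__":
    # Hypothetical library: 90% abundant sequences, 90% of them removed.
    print(enrichment_factor(0.9, 0.9))
    print(expected_pathogen_reads(1_000_000, 1e-6, 0.9, 0.9))
```

Under this model the gain depends only on the product of the two fractions, so depletion helps most when the targeted sequences dominate the library and are removed efficiently.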

Article activity feed

  1. SciScore for 10.1101/2022.05.12.22274799:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics: not detected.
    Sex as a biological variable: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.

    Table 2: Resources

    Software and Algorithms
    Library sizes were evaluated using the Agilent BioAnalyzer 2100 and the high sensitivity dsDNA kit.
    Agilent BioAnalyzer
    suggested: None
    The Illumina sequencing adapters were removed, and low-quality bases were trimmed using AdapterRemoval (v2.3.1).
    suggested: (AdapterRemoval, RRID:SCR_011834)
    All remaining reads after host filtering were assigned taxonomy using Kraken2 (v2.1.1) with PlusPF database (release date: 1/27/2021).
    suggested: None
    For rRNA content estimation using Kraken2, a Kraken database (containing rRNA sequences from prokaryotes and eukaryotes) was built from the rRNA sequences collected from NCBI Nucleotide database using the following query: “biomol_rrna[PROP]” (as of March 17, 2021).
    suggested: (Kraken, RRID:SCR_005484)
    The 40M subsampled read pairs were mapped to the combined genome sequences using BWA-MEM (v0.7.17).
    suggested: (Sniffles, RRID:SCR_017619)
    For each species, the number of mapped reads and the number of total bases mapped were collected using Bedtools (v1.9) “multicov” and Samtools (v1.9) “depth” commands, respectively, with optional parameters “-d 0 -aa” being used for Samtools “depth” command to accurately report the depths in deeply covered regions.
    suggested: (BEDTools, RRID:SCR_006646)
    suggested: (SAMTOOLS, RRID:SCR_002105)
    AMR gene identification: The assembled contigs from the CZID workflow with 40M subsampled read pairs were retrieved and searched against AMR genes using NCBI AMRFinderPlus (v3.10.21)
    NCBI AMRFinderPlus
    suggested: None
    Within DEGenR, the raw read counts were imported, filtered, normalized using edgeR R-package to filter out any low-expressed genes.
    suggested: (edgeR, RRID:SCR_012802)
    The Enrichr R package (37) was used to rank enriched terms among DEGs using different databases and resources, including GO biological processes.
    suggested: (Enrichr, RRID:SCR_001575)
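
The per-species coverage step quoted above (Samtools “depth” with “-d 0 -aa”) can be illustrated with a short sketch. It assumes the three-column, tab-separated output that `samtools depth` writes for a single alignment file (reference name, 1-based position, depth). The function name `summarize_depth` and the toy input are hypothetical and are not part of the manuscript's pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable


def summarize_depth(lines: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Summarize `samtools depth -aa` style text per reference sequence.

    Returns, for each reference, the total bases mapped (sum of depths),
    the mean depth, and the breadth (fraction of positions with depth > 0).
    """
    # Per reference: [positions seen, summed depth, positions with depth > 0]
    totals = defaultdict(lambda: [0, 0, 0])
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        ref, _pos, depth = line.split("\t")[:3]
        d = int(depth)
        entry = totals[ref]
        entry[0] += 1
        entry[1] += d
        entry[2] += 1 if d > 0 else 0
    return {
        ref: {
            "total_bases": float(summed),
            "mean_depth": summed / n,
            "breadth": covered / n,
        }
        for ref, (n, summed, covered) in totals.items()
    }


if __name__ == "__main__":
    # Toy input; with -aa, zero-depth positions are reported as well.
    example = ["virusA\t1\t0", "virusA\t2\t4", "virusA\t3\t2", "hostB\t1\t1"]
    for ref, stats in summarize_depth(example).items():
        print(ref, stats)
```

Because “-aa” reports every position, including those with zero depth, the number of lines per reference equals its length, which is what makes the mean depth and breadth meaningful here.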

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.

    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

    Results from JetFighter: We did not find any issues relating to colormaps.

    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.