A Universal Day Zero Infectious Disease Testing Strategy Leveraging CRISPR-based Sample Depletion and Metagenomic Sequencing

Abstract

The lack of preparedness for detecting SARS-CoV-2, the highly infectious pathogen responsible for COVID-19, has caused enormous harm to public health and the economy. It took ∼60 days for the first reverse transcription quantitative polymerase chain reaction (RT-qPCR) tests for SARS-CoV-2 infection, developed by the United States Centers for Disease Control (CDC), to be made publicly available. It then took >270 days to deploy 800,000 of these tests, at a time when estimated testing needs exceeded 6 million tests per day. Testing was therefore limited to individuals with symptoms or in close contact with confirmed positive cases. Testing strategies that can be deployed on a population scale at ‘Day Zero’, i.e., at the time of the first reported case, would be of significant value. Next Generation Sequencing (NGS) has such Day Zero capabilities, with the potential for broad and large-scale testing. However, its detection sensitivity is limited when pathogens are present at low copy numbers. Here we demonstrate that, by using CRISPR-Cas9 to remove abundant sequences that do not contribute to pathogen detection, NGS achieves a detection sensitivity for SARS-CoV-2 comparable to that of RT-qPCR. In addition, we show that this assay can be used for variant strain typing, co-infection detection, and individual human host response assessment, all in a single workflow using existing open-source analysis pipelines. This NGS workflow is pathogen agnostic and therefore has the potential to transform how both large-scale pandemic response and focused clinical infectious disease testing are pursued in the future.

SIGNIFICANCE STATEMENT

The lack of preparedness for detecting infectious pathogens has had a devastating effect on the global economy and society. A ‘Day Zero’ testing strategy that can be deployed at the first reported case and expanded to population scale is therefore required. Next generation sequencing enables Day Zero capabilities but is inadequate for detecting low levels of pathogen because abundant sequences of little biological interest dominate the reads. By applying the CRISPR-Cas system to remove these sequences in vitro, we show sensitivity of pathogen detection equivalent to RT-qPCR. The workflow is pathogen agnostic and enables detection of strain types, co-infections and human host response with a single workflow and open-source analysis tools. These results highlight the potential to transform future large-scale pandemic response.
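
To make the depletion argument concrete, the sketch below works through a deliberately simple model of why removing abundant sequences raises sensitivity at a fixed sequencing depth. It is not taken from the manuscript: the function name `enrichment_factor`, its parameters, and the example values (90% abundant reads, 99% depletion efficiency) are illustrative assumptions, and the model assumes depletion removes only the targeted sequences.

```python
def enrichment_factor(abundant_fraction: float, depletion_efficiency: float) -> float:
    """Fold increase in the share of reads from non-depleted sequences.

    Assumed toy model (not from the paper): a fraction `abundant_fraction`
    of library molecules are abundant, uninformative sequences, and
    depletion removes a fraction `depletion_efficiency` of them while
    leaving all other molecules untouched.
    """
    if not 0.0 <= abundant_fraction < 1.0:
        raise ValueError("abundant_fraction must be in [0, 1)")
    if not 0.0 <= depletion_efficiency <= 1.0:
        raise ValueError("depletion_efficiency must be in [0, 1]")
    # Fraction of the original library that survives depletion.
    remaining = 1.0 - abundant_fraction * depletion_efficiency
    # At a fixed number of sequenced reads, every surviving molecule's
    # share of the reads grows by the inverse of that fraction.
    return 1.0 / remaining


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    fold = enrichment_factor(abundant_fraction=0.9, depletion_efficiency=0.99)
    print(f"Pathogen read share increases about {fold:.1f}-fold")
```

Under these assumptions the gain depends strongly on how dominant the depleted sequences are: removing all of a 50% abundant fraction only doubles the pathogen read share, whereas the same efficiency against a 90% fraction yields a much larger gain.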

SciScore for 10.1101/2022.05.12.22274799:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Ethics	not detected.
Sex as a biological variable	not detected.
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Library sizes were evaluated using the Agilent BioAnalyzer 2100 and the high sensitivity dsDNA kit.	Agilent BioAnalyzer suggested: None
The Illumina sequencing adapters were removed, and low-quality bases were trimmed using AdapterRemoval (v2.3.1).	AdapterRemoval suggested: (AdapterRemoval, RRID:SCR_011834)
All remaining reads after host filtering were assigned taxonomy using Kraken2 (v2.1.1) with PlusPF database (release date: 1/27/2021).	Kraken2 suggested: None
For rRNA content estimation using Kraken2, a Kraken database (containing rRNA sequences from prokaryotes and eukaryotes) was built from the rRNA sequences collected from NCBI Nucleotide database using the following query: “biomol_rrna[PROP]” (as of March 17, 2021).	Kraken suggested: (Kraken, RRID:SCR_005484)
The 40M subsampled read pairs were mapped to the combined genome sequences using BWA-MEM (v0.7.17).	BWA-MEM suggested: (Sniffles, RRID:SCR_017619)
For each species, the number of mapped reads and the number of total bases mapped were collected using Bedtools (v1.9) “multicov” and Samtools (v1.9) “depth” commands, respectively, with optional parameters “-d 0 -aa” being used for Samtools “depth” command to accurately report the depths in deeply covered regions.	Bedtools suggested: (BEDTools, RRID:SCR_006646) Samtools suggested: (SAMTOOLS, RRID:SCR_002105)
AMR gene identification: The assembled contigs from the CZID workflow with 40M subsampled read pairs were retrieved and searched against AMR genes using NCBI AMRFinderPlus (v3.10.21)	NCBI AMRFinderPlus suggested: None
Within DEGenR, the raw read counts were imported, filtered, normalized using edgeR R-package to filter out any low-expressed genes.	edgeR suggested: (edgeR, RRID:SCR_012802)
The Enrichr R package (37) was used to rank enriched terms among DEGs using different databases and resources, including GO biological processes.	Enrichr suggested: (Enrichr, RRID:SCR_001575)
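
The per-species depth collection described in the table can be illustrated with a short sketch. It assumes the three-column, tab-separated text that `samtools depth` writes for a single BAM file (reference name, position, depth), with `-aa` causing zero-depth positions to be reported as well. The names `CoverageSummary` and `summarise_depth` and the reference names in the example are hypothetical; this is not the authors' pipeline code.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class CoverageSummary(NamedTuple):
    total_bases: int  # sum of per-position depths (total bases mapped)
    positions: int    # number of reference positions reported
    covered: int      # positions with depth greater than zero

    @property
    def mean_depth(self) -> float:
        return self.total_bases / self.positions if self.positions else 0.0

    @property
    def breadth(self) -> float:
        return self.covered / self.positions if self.positions else 0.0


def summarise_depth(lines: Iterable[str]) -> Dict[str, CoverageSummary]:
    """Aggregate `samtools depth`-style lines into per-reference summaries."""
    totals = defaultdict(lambda: [0, 0, 0])
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        ref, _pos, depth = line.split("\t")[:3]
        d = int(depth)
        entry = totals[ref]
        entry[0] += d           # total bases mapped
        entry[1] += 1           # positions seen
        entry[2] += int(d > 0)  # positions with any coverage
    return {ref: CoverageSummary(*vals) for ref, vals in totals.items()}


if __name__ == "__main__":
    # Hypothetical reference names and depths, for illustration only.
    example = ["virusA\t1\t3", "virusA\t2\t0", "virusA\t3\t5", "bactB\t1\t1"]
    for ref, summary in summarise_depth(example).items():
        print(ref, summary.total_bases, round(summary.breadth, 2))
```

In practice the input would be streamed from a file or a subprocess pipe rather than held in a list, since `-aa` output contains one line per reference position.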

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.
