Target Capture Sequencing of SARS-CoV-2 Genomes Using the ONETest Coronaviruses Plus

Abstract

Background

Genomic sequencing is important to track and monitor genetic changes in SARS-CoV-2. We introduce a target capture next-generation sequencing methodology, the ONETest Coronaviruses Plus, to sequence SARS-CoV-2 genomes and select genes of other respiratory viruses simultaneously.

Methods

We applied the ONETest to 70 respiratory samples (collected in Florida, USA, between May and July 2020) in which SARS-CoV-2 had been detected by a qualitative PCR assay. For 48 (69%) of the samples, we also applied the ARTIC protocol for Illumina sequencing. All libraries were sequenced as 2×150 nucleotide reads on an Illumina instrument. The ONETest data were analyzed using an in-house pipeline and the ARTIC data using a published pipeline to produce consensus SARS-CoV-2 genome sequences, to which lineages were assigned using pangolin.

Results

Of the 70 ONETest libraries, 45 (64%) had a complete or near-complete SARS-CoV-2 genome sequence (> 29,000 bases and with > 90% of its bases covered by at least 10 reads). Of the 48 ARTIC libraries, 25 (52%) had a complete or near-complete SARS-CoV-2 genome sequence.
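
The completeness criterion stated above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name, the constants, and the idea of passing a per-base depth list for the consensus sequence are hypothetical and are not taken from the authors' in-house pipeline.

```python
from typing import Sequence

# Thresholds taken from the abstract; the names are hypothetical.
MIN_LENGTH = 29_000    # consensus must be longer than this many bases
MIN_DEPTH = 10         # a base counts as covered at >= 10 reads
MIN_FRACTION = 0.90    # more than 90% of bases must be covered


def is_near_complete(consensus_length: int, depths: Sequence[int]) -> bool:
    """Return True if a genome meets the 'complete or near-complete' rule.

    `depths` is assumed to hold one read-depth value per consensus base.
    """
    if not depths:
        return False
    covered = sum(1 for d in depths if d >= MIN_DEPTH)
    long_enough = consensus_length > MIN_LENGTH
    well_covered = covered / len(depths) > MIN_FRACTION
    return long_enough and well_covered
```

Both conditions are strict inequalities, matching the ">" signs in the text, so a genome with exactly 90% of its bases covered would not qualify under this reading.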

In 24 out of 34 (71%) samples in which both the ONETest and ARTIC sequences were complete or near-complete and in which lineage could be assigned to both the ONETest and ARTIC sequences, the SARS-CoV-2 lineage identified was the same.
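
As a rough sketch of how such a concordance figure could be tallied, the snippet below counts samples whose two lineage calls agree, restricted to samples with a lineage assigned by both methods. The function name, the dictionary layout, and the use of `None` for an unassigned lineage are assumptions made for illustration and do not describe the authors' code.

```python
from typing import Dict, Optional, Tuple


def lineage_concordance(
    onetest: Dict[str, Optional[str]],
    artic: Dict[str, Optional[str]],
) -> Tuple[int, int]:
    """Return (concordant, comparable) counts of samples.

    A sample is comparable only if both methods assigned a lineage
    (None is assumed here to mean 'no lineage assigned').
    """
    comparable = [
        sample
        for sample in onetest
        if onetest[sample] is not None and artic.get(sample) is not None
    ]
    concordant = sum(1 for s in comparable if onetest[s] == artic[s])
    return concordant, len(comparable)


def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)
```

With the counts reported above, `percent(24, 34)` evaluates to 71, which matches the stated figure.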

Conclusions

The ONETest can be used to sequence the SARS-CoV-2 genomes in archived samples and thereby enable detection of circulating and emerging SARS-CoV-2 variants. Target capture approaches, such as the ONETest, are less prone to the loss of sequence coverage, probably caused by amplicon dropouts, that is encountered in amplicon approaches such as ARTIC. With its added value of characterizing other major respiratory pathogens, although this was not assessed in this study, the ONETest can help to better understand the epidemiology of infectious respiratory disease in the post-COVID-19 era.

Article activity feed

  1. SciScore for 10.1101/2021.03.25.437083:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: IRB: Ethics review: Approval for this study was obtained from the University of Florida Institutional Review Board (IRB202001328).
    Randomization: Using seqtk v1.3 (https://github.com/lh3/seqtk), we randomly down-sampled (without replacement) the 2×150 nt reads of each ONETest library so that the resulting library had the same number of reads as the matched ARTIC library; each ONETest library was sub-sampled three times in this manner to generate three simulated replicates of the library.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: Among the patients, 30 (43%) were male and 40 (57%) were female.
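
The down-sampling described in the Randomization entry can be illustrated with a small Python sketch. The study used seqtk v1.3; the code below is not seqtk and does not reproduce its behavior exactly. It is a generic reservoir-sampling sketch, and the function names, the seed values, and the representation of a read pair as a single item are assumptions made for illustration.

```python
import random
from typing import Iterable, List, Sequence, TypeVar

T = TypeVar("T")


def downsample(pairs: Iterable[T], target: int, seed: int) -> List[T]:
    """Draw `target` read pairs uniformly at random without replacement.

    Uses reservoir sampling, so the input can be streamed once.
    If fewer than `target` pairs exist, all of them are returned.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, pair in enumerate(pairs):
        if i < target:
            reservoir.append(pair)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < target:
                reservoir[j] = pair
    return reservoir


def simulated_replicates(
    pairs: Sequence[T], target: int, seeds: Sequence[int] = (1, 2, 3)
) -> List[List[T]]:
    """Build one down-sampled library per seed (three by default)."""
    return [downsample(pairs, target, seed) for seed in seeds]
```

Here `target` would be the read count of the matched ARTIC library. Treating each mate pair as one item keeps the two reads of a pair together, and a distinct seed per replicate makes each of the three draws reproducible.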

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Plus RealAmp Kit from OSANG Healthcare Co. Ltd., South Korea), which targets the RdRp, N, and E genes.
    OSANG Healthcare
    suggested: None
    Reads from these libraries were analyzed using a bioinformatics pipeline (v1.3.0; https://github.com/connor-lab/ncov2019-artic-nf) that automates the ARTIC data analysis protocol for Illumina reads (https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html), which utilizes bwa mem [15], samtools [16], and iVar [17].
    samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    Reads that mapped to the human genome sequence (GRCh38.p13, release 35) were discarded using bowtie2 v2.4.2 [18].
    bowtie2
    suggested: (Bowtie 2, RRID:SCR_016368)
    The pipeline was implemented in C/C++ and Python using a combination of in-house software and third-party tools, including Biopython v1.78 [19], bedtools v2.29.2 [20], pybedtools v0.8.1 [21], samtools/bcftools/htslib v1.11 [16], and Snakemake v5.26.1 [22].
    Python
    suggested: (IPython, RRID:SCR_001658)
    Biopython
    suggested: (Biopython, RRID:SCR_007173)
    bedtools
    suggested: (BEDTools, RRID:SCR_006646)
    samtools/bcftools/htslib
    suggested: None
    Snakemake
    suggested: (Snakemake, RRID:SCR_003475)
    Visualization was done in R using ggplot2 [23].
    ggplot2
    suggested: (ggplot2, RRID:SCR_014601)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.