CoronaHiT: High throughput sequencing of SARS-CoV-2 genomes

Abstract

The COVID-19 pandemic has spread to almost every country in the world since it started in China in late 2019. Controlling the pandemic requires a multifaceted approach including whole genome sequencing to support public health interventions at local and national levels. One of the most widely used methods for sequencing is the ARTIC protocol, a tiling PCR approach followed by Oxford Nanopore sequencing (ONT) of up to 96 samples at a time. There is a need, however, for a flexible, platform-agnostic method that can provide multiple throughput options depending on changing requirements as the pandemic peaks and troughs. Here we present CoronaHiT, a method capable of multiplexing up to 96 small genomes on a single MinION flowcell or >384 genomes on Illumina NextSeq, using transposase-mediated addition of adapters and PCR-based addition of barcodes to ARTIC PCR products. We demonstrate the method by sequencing 95 and 59 SARS-CoV-2 genomes for routine and rapid outbreak response runs, respectively, on Nanopore and Illumina platforms and compare to the standard ARTIC LoCost nanopore method. Of the 154 samples sequenced using the three approaches, genomes with ≥ 90% coverage (GISAID criteria) were generated for 64.3% of samples for ARTIC LoCost, 71.4% for CoronaHiT-ONT, and 76.6% for CoronaHiT-Illumina, and the resulting genomes showed almost identical clustering on a maximum likelihood tree. In conclusion, we demonstrate that CoronaHiT can multiplex up to 96 SARS-CoV-2 genomes per MinION flowcell and that Illumina sequencing can be performed on the same libraries, which will allow significantly higher throughput. CoronaHiT provides increased coverage for higher Ct samples, thereby increasing the number of high-quality genomes that pass the GISAID QC threshold. This protocol will aid the rapid expansion of SARS-CoV-2 genome sequencing globally, to help control the pandemic.
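
The ≥ 90% coverage criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "coverage" means the fraction of reference positions carrying an unambiguous base call (A, C, G or T) in the consensus sequence; the abstract does not spell out the exact definition, so this is an assumption. The function names are hypothetical and are not taken from the CoronaHiT pipeline. The reference length is that of Wuhan-Hu-1 (MN908947.3), 29,903 bp.

```python
# Length of the Wuhan-Hu-1 reference genome (accession MN908947.3), in bases.
WUHAN_HU_1_LENGTH = 29903


def genome_coverage(consensus: str, reference_length: int = WUHAN_HU_1_LENGTH) -> float:
    """Return the fraction of reference positions with an unambiguous base call.

    Assumption: N, gaps and IUPAC ambiguity codes all count as uncovered.
    """
    called = sum(1 for base in consensus.upper() if base in "ACGT")
    return called / reference_length


def passes_coverage_threshold(
    consensus: str,
    reference_length: int = WUHAN_HU_1_LENGTH,
    threshold: float = 0.90,
) -> bool:
    """Check a consensus against the >= 90% coverage threshold (hypothetical helper)."""
    return genome_coverage(consensus, reference_length) >= threshold
```

A consensus with 9 called bases out of a 10-base reference would pass under this definition, while one with 8 called bases would not.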

Article activity feed

  1. SciScore for 10.1101/2020.06.24.162156:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    For CoronaHiT-ONT data, we used the subcommand samtools ampliconclip (v 1.11) at the primer trimming step (https://github.com/quadram-institute-bioscience/fieldbioinformatics/tree/coronahit).
    samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    The consensus sequences were uploaded to GISAID and the raw sequence data was uploaded to the European Nucleotide Archive under BioProject PRJEB41737.
    BioProject
    suggested: (NCBI BioProject, RRID:SCR_004801)
    The raw reads were demultiplexed using bcl2fastq (v2.20) (Illumina Inc.) to produce 311 FASTQ files for the run with the routine samples (112 SARS-CoV-2 samples and 3 negative controls) and the run with the rapid response samples (247 SARS-CoV-2 samples, 4 negative controls, and 2 positive controls) with only the relevant samples analysed in this paper.
    bcl2fastq
    suggested: (bcl2fastq, RRID:SCR_015058)
    Briefly, the reads had adapters trimmed with TrimGalore (https://github.com/FelixKrueger/TrimGalore), were aligned to the Wuhan-Hu-1 reference genome (accession MN908947.3) using BWA-MEM (v0.7.17) (Li 2013), the ARTIC amplicons were trimmed and a consensus built using iVAR (v.1.2.3) (Grubaugh et al. 2019).
    TrimGalore
    suggested: None
    BWA-MEM
    suggested: (Sniffles, RRID:SCR_017619)
    A multiple FASTA alignment was created by aligning all samples to the reference genome MN908947.3 with MAFFT v7.470.
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    SNPs in the multiple FASTA alignment were identified using SNP-sites (v2.5.1) (Page et al. 2016) and the tree was visualised with FigTree (v1.4.4) (https://github.com/rambaut/figtree).
    FigTree
    suggested: (FigTree, RRID:SCR_008515)
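
The SNP identification step quoted in the table above (variable positions extracted from a multiple FASTA alignment) can be sketched as follows. This is a simplified illustration of the idea, not the SNP-sites implementation: it assumes that all sequences are already aligned to the same length and that only A, C, G and T count as informative, so N, gaps and ambiguity codes are ignored. The real tool may treat those cases differently. The function names and the toy alignment are hypothetical.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def variable_sites(alignment: dict) -> list:
    """Return 1-based alignment columns where more than one of A/C/G/T occurs.

    Assumption: N, gaps and ambiguity codes are treated as missing data.
    """
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences in the alignment must have equal length")
    sites = []
    for i in range(length):
        bases = {s[i] for s in seqs if s[i] in "ACGT"}
        if len(bases) > 1:
            sites.append(i + 1)
    return sites


# Toy alignment for illustration only; not data from the study.
toy = ">ref\nACGTACGT\n>sample1\nACGTACGA\n>sample2\nANGTTCGT\n"
snp_positions = variable_sites(parse_fasta(toy))
```

In the toy alignment, column 2 is not reported because the only difference there is an N, which this sketch treats as missing data rather than as a variant.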

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.