A random priming amplification method for whole genome sequencing of SARS-CoV-2 and H1N1 influenza A virus

Abstract

Background

Non-targeted whole genome sequencing is a powerful tool for comprehensively identifying the constituents of microbial communities in a sample. Because the analysis does not have to be directed at any particular organism before sequencing, it can reduce bias and false-negative results. It also allows the assessment of genetic aberrations in the genome (e.g., single nucleotide variants, deletions, insertions and copy number variants), including in non-protein-coding regions.

Methods

This study compared the performance of four different random priming amplification methods in recovering RNA viral genetic material of SARS-CoV-2. In method 1 (H-P) the reverse transcriptase (RT) step was performed with random hexamers, whereas in methods 2-4 the RT step incorporated an octamer primer with a known tag. In methods 1 and 2 (K-P) sequencing was applied to material derived from the RT-PCR step, whereas in methods 3 (SISPA) and 4 (S-P) an additional amplification was incorporated before sequencing.

Results

Of the methods we tested, SISPA was the most effective and efficient for non-targeted/random priming whole genome sequencing of SARS-CoV-2. The SISPA method described in this study allowed for whole genome assembly of SARS-CoV-2 and influenza A(H1N1)pdm09 in mixed samples. The limit of detection and characterization for SARS-CoV-2 was 10^3 pfu/ml (Ct, 22.4) for whole genome assembly and 10^1 pfu/ml (Ct, 30) for metagenomics detection.

Conclusions

The SISPA method is particularly useful for obtaining genome sequences from RNA viruses or investigating complex clinical samples, as no prior sequence information is needed. It might be applied to monitor genomic changes and evolution in viruses, and it can be used for fast metagenomics detection or to assess the general picture of the different pathogens within a sample.

Article activity feed

  1. SciScore for 10.1101/2021.06.25.449750:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics: not detected.
    Sex as a biological variable: not detected.
    Randomization: If a contig was larger than 150 bases (i.e., the average size of read) a random 100 bp segment of that contig was sampled.
    Blinding: not detected.
    Power Analysis: not detected.
    Cell Line Authentication: not detected.
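
The Randomization entry above describes sampling a random 100 bp segment from contigs longer than 150 bases. A minimal Python sketch of that step follows. The function name `sample_segment` is hypothetical, and returning short contigs unchanged is an assumption, since the quoted sentence does not say how they were handled.

```python
import random

READ_LENGTH = 150     # average read size quoted in the Randomization entry
SEGMENT_LENGTH = 100  # segment length quoted in the Randomization entry


def sample_segment(contig, rng, read_length=READ_LENGTH,
                   segment_length=SEGMENT_LENGTH):
    """Return a random fixed-length segment of a long contig.

    Assumption: contigs no longer than `read_length` are returned whole,
    because the source does not describe that case.
    """
    if len(contig) <= read_length:
        return contig
    # Pick a start position so the segment fits entirely inside the contig.
    start = rng.randrange(len(contig) - segment_length + 1)
    return contig[start:start + segment_length]


if __name__ == "__main__":
    rng = random.Random(0)   # seeded only to make the demo repeatable
    long_contig = "ACGT" * 100   # 400 bases, made-up sequence
    print(len(sample_segment(long_contig, rng)))
```

Passing the random generator in explicitly keeps the sampling reproducible when a seed is fixed, which is one possible design choice rather than something the manuscript specifies.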

    Table 2: Resources

    Experimental Models: Cell Lines
    Sentences | Resources
    SARS-CoV-2, hCov-19/England/2/2020 was propagated in Vero E6 cells (ATCC® CRL-1586™) and Edinburgh SARS-CoV-2 isolates were propagated in Caco-2 cells (ATCC® HTB-37™).
    Caco-2
    suggested: None
    Infectious virus quantification: Using plaque assay, SARS-CoV-2 was quantified in Vero E6 cells and A(H1N1)pdm09 (IAV) in MDCK cells and expressed as plaque forming units (pfu)/ ml.
    MDCK
    suggested: CLS Cat# 602280/p823_MDCK_(NBL-2, RRID:CVCL_0422)
    For SARS-CoV-2, Vero E6 cells were inoculated with 10-fold dilutions of SARS-CoV-2 for 1 h and overlaid with 0.8% (w/v) Avicel medium (1 x MEM Temin’s modification (Gibco), 0.8% (w/v) Avicel® Microcrystalline Cellulose and Sodium Carboxymethylcellulose (FMC BioPolymer), 2% FBS (v/v) (Gibco)).
    Vero E6
    suggested: None
    Software and Algorithms
    Sentences | Resources
    Sequence analysis: The quality of sequencing reads was assessed using FastQC ver.
    FastQC
    suggested: (FastQC, RRID:SCR_014583)
    The reads were quality trimmed using a quality score of 30 or more, in addition to low-quality ends trimming and adapter removal using Trim Galore ver.
    Trim Galore
    suggested: (Trim Galore, RRID:SCR_011847)
    Resulting contigs were quality assessed using QUAST (version 5.0.2) [35, 36].
    QUAST
    suggested: (QUAST, RRID:SCR_001228)
    Consensus sequences were re-called based on BWA-MEM mapping of trimmed (but un-normalized) read data to the genome scaffold and parsing of the mpileup alignment.
    BWA-MEM
    suggested: (Sniffles, RRID:SCR_017619)
    0.7.17 [38] and Geneious 9.1.2 (https://www.geneious.com).
    Geneious
    suggested: (Geneious, RRID:SCR_010519)
    Cleaned datasets were mapped against the reference followed by variant calling with LoFreq ver 3.0 [39] to identify the presence of variants arising from inter- or intra-population quasispecies at 3% frequency.
    LoFreq
    suggested: (LoFreq, RRID:SCR_013054)
    Metagenomics detection: Three independent methods were used to detect the presence of the viruses in the samples (Figure 5). (1) Assembly: The first method used the contigs assembled by the SPAdes assembler using an in-house pipeline.
    SPAdes
    suggested: (SPAdes, RRID:SCR_000131)
    If any of the sampled reads mapped to a virus, its top ten hits were examined, and the contig it was derived from was aligned to the nt-database with BLAST (allowing a maximum of 10 hits per contig).
    BLAST
    suggested: (BLASTX, RRID:SCR_001653)
    Each read was inspected using Kraken and its minikraken database to build a report containing the possible organisms the sequences originated from and the number of reads supporting their presence.
    Kraken
    suggested: (Kraken, RRID:SCR_005484)
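
The Kraken sentence above describes a report of candidate organisms and the number of reads supporting each one. As a rough illustration of that tally, the Python sketch below counts classified reads per taxonomy ID. It assumes Kraken's tab-separated per-read output, in which the first column is `C` or `U` and the third column is the assigned taxonomy ID. The function name `count_reads_per_taxon` and the example records are hypothetical and are not taken from the manuscript.

```python
from collections import Counter


def count_reads_per_taxon(kraken_lines):
    """Count classified reads per taxonomy ID.

    Assumes tab-separated per-read records whose first field is 'C'
    (classified) or 'U' (unclassified) and whose third field is the
    taxonomy ID assigned to the read.
    """
    counts = Counter()
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != "C":
            continue  # skip unclassified reads and malformed lines
        counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    # Made-up records for illustration only.
    example = [
        "C\tread1\t2697049\t150\t2697049:120",
        "U\tread2\t0\t150\t0:120",
        "C\tread3\t2697049\t150\t2697049:120",
        "C\tread4\t11320\t150\t11320:120",
    ]
    for taxid, n_reads in count_reads_per_taxon(example).most_common():
        print(taxid, n_reads)
```

Kraken ships its own reporting script, so this sketch only shows the idea of counting supporting reads and is not a replacement for the tool's report.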

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.