A random priming amplification method for whole genome sequencing of SARS-CoV-2 and H1N1 influenza A virus

Abstract

Background

Non-targeted whole genome sequencing is a powerful tool for comprehensively identifying the constituents of microbial communities in a sample. Because the analysis does not need to be directed at any particular organism before sequencing, it can reduce bias and false-negative results. It also allows the assessment of genetic aberrations in the genome (e.g., single nucleotide variants, deletions, insertions and copy number variants), including in non-protein-coding regions.

Methods

The performance of four different random priming amplification methods in recovering RNA viral genetic material of SARS-CoV-2 was compared in this study. In method 1 (H-P), the reverse transcriptase (RT) step was performed with random hexamers, whereas in methods 2-4 the RT step incorporated an octamer primer with a known tag. In methods 1 and 2 (K-P), sequencing was applied to material derived from the RT-PCR step, whereas in methods 3 (SISPA) and 4 (S-P) an additional amplification was incorporated before sequencing.

Results

Of the methods we tested, SISPA was the most effective and efficient for non-targeted/random priming whole genome sequencing of SARS-CoV-2. The SISPA method described in this study allowed whole genome assembly of SARS-CoV-2 and influenza A(H1N1)pdm09 in mixed samples. The limit of detection and characterization for SARS-CoV-2 was 10³ pfu/ml (Ct, 22.4) for whole genome assembly and 10¹ pfu/ml (Ct, 30) for metagenomics detection.

Conclusions

The SISPA method is particularly useful for obtaining genome sequences from RNA viruses or investigating complex clinical samples, as no prior sequence information is needed. It might be applied to monitor genomic changes in viruses and virus evolution, and it can be used for fast metagenomics detection or to assess the general picture of different pathogens within a sample.

SciScore for 10.1101/2021.06.25.449750:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Ethics	not detected.
Sex as a biological variable	not detected.
Randomization	If a contig was larger than 150 bases (i.e., the average size of read) a random 100 bp segment of that contig was sampled.
Blinding	not detected.
Power Analysis	not detected.
Cell Line Authentication	not detected.
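
The randomization step quoted in Table 1 (sampling a random 100 bp segment from any contig longer than 150 bases) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `sample_segment`, its parameters, and the choice to return shorter contigs unchanged are all assumptions made for the example.

```python
import random


def sample_segment(contig, min_len=150, seg_len=100, rng=None):
    """Return a random seg_len-base window from contigs longer than min_len.

    Assumption (not stated in the source): contigs of min_len bases or
    fewer are returned whole.
    """
    rng = rng or random.Random()
    if len(contig) <= min_len:
        return contig
    # Pick a start position so the window fits entirely inside the contig.
    start = rng.randrange(len(contig) - seg_len + 1)
    return contig[start:start + seg_len]


if __name__ == "__main__":
    contig = "ACGT" * 100  # hypothetical 400-base contig
    segment = sample_segment(contig, rng=random.Random(0))
    print(len(segment))
```

Passing a seeded `random.Random` instance makes the sampling reproducible, which helps when the sampled segments are later compared across runs.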

Table 2: Resources

Experimental Models: Cell Lines
Sentences	Resources
SARS-CoV-2, hCov-19/England/2/2020 was propagated in Vero E6 cells (ATCC® CRL-1586™) and Edinburgh SARS-CoV-2 isolates were propagated in Caco-2 cells (ATCC® HTB-37™).	Caco-2 suggested: None
Infectious virus quantification: Using plaque assay, SARS-CoV-2 was quantified in Vero E6 cells and A(H1N1)pdm09 (IAV) in MDCK cells and expressed as plaque forming units (pfu)/ ml.	MDCK suggested: CLS Cat# 602280/p823_MDCK_(NBL-2, RRID:CVCL_0422)
For SARS-CoV-2, Vero E6 cells were inoculated with 10-fold dilutions of SARS-CoV-2 for 1 h and overlaid with 0.8% (w/v) Avicel medium (1 x MEM Temin’s modification (Gibco), 0.8% (w/v) Avicel® Microcrystalline Cellulose and Sodium Carboxymethylcellulose (FMC BioPolymer), 2% FBS (v/v) (Gibco).	Vero E6 suggested: None
Software and Algorithms
Sentences	Resources
Sequence analysis: The quality of sequencing reads was assessed using FastQC ver.	FastQC suggested: (FastQC, RRID:SCR_014583)
The reads were quality trimmed using a quality score of 30 or more, in addition to low-quality ends trimming and adapter removal using Trim Galore ver.	Trim Galore suggested: (Trim Galore, RRID:SCR_011847)
Resulting contigs were quality assessed using QUAST (version 5.0.2) [35, 36].	QUAST suggested: (QUAST, RRID:SCR_001228)
Consensus sequences were re-called based on BWA-MEM mapping of trimmed (but un-normalized) read data to the genome scaffold and parsing of the mpileup alignment.	BWA-MEM suggested: (Sniffles, RRID:SCR_017619)
0.7.17 [38] and Geneious 9.1.2 (https://www.geneious.com).	Geneious suggested: (Geneious, RRID:SCR_010519)
Cleaned datasets were mapped against the reference followed by variant calling with LoFreq ver 3.0 [39] to identify the presence of variants arising from inter- or intra-population quasispecies at 3% frequency.	LoFreq suggested: (LoFreq, RRID:SCR_013054)
Metagenomics detection: Three independent methods were used to detect the presence of the viruses in the samples (Figure 5). (1) Assembly: The first method used the contigs assembled by SPAdes assembler using inhouse pipeline.	SPAdes suggested: (SPAdes, RRID:SCR_000131)
If any of the sampled reads mapped to a virus, its top ten hits were examined, and the contig it was derived from was aligned to the nt-database with BLAST (allowing a maximum of 10 hits per contig).	BLAST suggested: (BLASTX, RRID:SCR_001653)
Each read was inspected using Kraken and its minikraken database to build a report containing the possible organisms the sequences originated from and the number of reads supporting their presence.	Kraken suggested: (Kraken, RRID:SCR_005484)
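
The 3% frequency cutoff mentioned for variant calling in the table above can be illustrated with a small filter. This is a sketch under stated assumptions and does not reproduce LoFreq or its output format: the `Variant` record, its fields, and `filter_by_frequency` are hypothetical names introduced only for this example.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical variant record; not LoFreq's actual output format."""
    pos: int
    ref: str
    alt: str
    alt_reads: int  # reads supporting the alternate allele
    depth: int      # total reads covering the position


def filter_by_frequency(variants, min_freq=0.03):
    """Keep variants whose alternate-allele frequency is at least min_freq."""
    kept = []
    for v in variants:
        if v.depth == 0:
            continue  # no coverage, so frequency is undefined
        if v.alt_reads / v.depth >= min_freq:
            kept.append(v)
    return kept


if __name__ == "__main__":
    calls = [
        Variant(241, "C", "T", 50, 1000),  # 5% frequency
        Variant(3037, "C", "T", 10, 1000), # 1% frequency
    ]
    for v in filter_by_frequency(calls):
        print(v.pos, v.ref, v.alt)
```

The positions and read counts in the usage example are made-up values chosen to fall on either side of the threshold.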

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.
