Assessment of Inter-Laboratory Differences in SARS-CoV-2 Consensus Genome Assemblies between Public Health Laboratories in Australia
Listed in
- Evaluated articles (ScreenIT)
Abstract
Whole-genome sequencing of viral isolates is critical for informing transmission patterns and for tracking the ongoing evolution of pathogens, especially during a pandemic. However, when genomes have low variability in the early stages of a pandemic, the impact of technical and/or sequencing errors increases. We quantitatively assessed inter-laboratory differences in consensus genome assemblies of 72 matched SARS-CoV-2-positive specimens sequenced at different laboratories in Sydney, Australia. Raw sequence data were assembled using two different bioinformatics pipelines in parallel, and resulting consensus genomes were compared to detect laboratory-specific differences. Matched genome sequences were predominantly concordant, with a median pairwise identity of 99.997%. Identified differences were predominantly driven by ambiguous site content. Ignoring these produced differences in only 2.3% (5/216) of pairwise comparisons, each differing by a single nucleotide. Matched samples were assigned the same Pango lineage in 98.2% (212/216) of pairwise comparisons, and were mostly assigned to the same phylogenetic clade. However, epidemiological inference based only on single nucleotide variant distances may lead to significant differences in the number of defined clusters if variant allele frequency thresholds for consensus genome generation differ between laboratories. These results underscore the need for a unified, best-practices approach to bioinformatics between laboratories working on a common outbreak problem.
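The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and the rule of masking a site as N when the major allele falls below the threshold are all assumptions made for illustration. The sketch shows how two already-aligned consensus genomes might be compared with and without ambiguous sites, and how the same read counts can yield different consensus bases under different variant allele frequency thresholds.

```python
UNAMBIGUOUS = set("ACGT")


def compare_consensus(seq_a, seq_b):
    """Compare two aligned consensus genomes of equal length.

    Returns a tuple of:
      identity  - fraction of sites that are identical,
      all_diffs - number of sites that differ in any way,
      snv_diffs - number of differing sites where both bases are A/C/G/T,
                  i.e. differences that remain after ignoring ambiguous sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    all_diffs = 0
    snv_diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            continue
        all_diffs += 1
        if a in UNAMBIGUOUS and b in UNAMBIGUOUS:
            snv_diffs += 1
    identity = 1 - all_diffs / len(seq_a)
    return identity, all_diffs, snv_diffs


def call_consensus_base(allele_counts, min_vaf):
    """Call one consensus base from per-allele read counts.

    Hypothetical rule for illustration only: report the most common allele
    if its frequency is at least min_vaf, otherwise mask the site as 'N'.
    """
    total = sum(allele_counts.values())
    if total == 0:
        return "N"
    base, count = max(allele_counts.items(), key=lambda item: item[1])
    return base if count / total >= min_vaf else "N"


if __name__ == "__main__":
    # Toy 10-base "genomes" for one specimen; real genomes are ~30 kb.
    lab1 = "ACGTACGTAC"
    lab2 = "ACGTNCGTAC"  # differs from lab1 only at an ambiguous site
    lab3 = "ACGTACGTAT"  # differs from lab1 by one unambiguous nucleotide

    print(compare_consensus(lab1, lab2))
    print(compare_consensus(lab1, lab3))

    # The same mixed site called under two different thresholds.
    counts = {"C": 30, "T": 70}
    print(call_consensus_base(counts, min_vaf=0.5))
    print(call_consensus_base(counts, min_vaf=0.9))
```

In this toy case the first pair differs only at an ambiguous site, so it contributes nothing to a single nucleotide variant distance, while the mixed site is called as a definite base under one threshold and masked under the other. This is the kind of divergence that can shift cluster counts when laboratories use different thresholds.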
Article activity feed
SciScore for 10.1101/2021.08.19.21262296:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms

Sentence: Each SARS-CoV-2-positive extract is also sent to the Institute of Clinical Pathology and Medical Research (ICPMR), NSW Health Pathology-West, NSW, Australia, for WGS according to their established protocols (8).
Resource: WGS (suggested: None)

Sentence: Library preparation was carried out using an Illumina Nextera XT Kit, followed by sequencing on an Illumina iSeq or MiniSeq (150 cycles).
Resource: MiniSeq (suggested: None)

Sentence: Clean reads were then mapped to the NCBI RefSeq assembly of SARS-CoV-2 (NC_045512.2) using bwa mem v0.7.17-r1188 (26), with unmapped reads discarded, and primer sequences were soft-clipped from the alignment using ivar trim v.
Resource: RefSeq (suggested: RefSeq, RRID:SCR_003496)

Sentence: Alignments were converted to pileup format using samtools mpileup v1.10 (27) without discarding anomalous read pairs (-A), per-base alignment quality disabled (-B), and no minimum PHRED quality for bases (-Q 0).
Resource: samtools (suggested: SAMTOOLS, RRID:SCR_002105)

Sentence: Demultiplexed raw sequencing data from Lab2 were quality trimmed using Trimmomatic (v0.36, sliding window of 4, minimum read quality score of 20, leading/trailing quality of 5) (29).
Resource: Trimmomatic (suggested: Trimmomatic, RRID:SCR_011848)

Results from OddPub: Thank you for sharing your code and data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.
