Assessment of Inter-Laboratory Differences in SARS-CoV-2 Consensus Genome Assemblies between Public Health Laboratories in Australia

Charles S. P. Foster
Sacha Stelzer-Braid
Ira W. Deveson
Rowena A. Bull
Malinna Yeang
Jane-Phan Au
Mariana Ruiz Silva
Sebastiaan J. van Hal
Rebecca J. Rockett
Vitali Sintchenko
Ki Wook Kim
William D. Rawlinson

Abstract

Whole-genome sequencing of viral isolates is critical for informing transmission patterns and for tracking the ongoing evolution of pathogens, especially during a pandemic. However, when genomes have low variability in the early stages of a pandemic, the impact of technical and/or sequencing errors increases. We quantitatively assessed inter-laboratory differences in consensus genome assemblies of 72 matched SARS-CoV-2-positive specimens sequenced at different laboratories in Sydney, Australia. Raw sequence data were assembled using two different bioinformatics pipelines in parallel, and the resulting consensus genomes were compared to detect laboratory-specific differences. Matched genome sequences were predominantly concordant, with a median pairwise identity of 99.997%. The differences identified were driven mainly by ambiguous site content. When ambiguous sites were ignored, differences remained in only 2.3% (5/216) of pairwise comparisons, each differing by a single nucleotide. Matched samples were assigned the same Pango lineage in 98.2% (212/216) of pairwise comparisons, and were mostly assigned to the same phylogenetic clade. However, epidemiological inference based only on single nucleotide variant distances may lead to significant differences in the number of defined clusters if variant allele frequency thresholds for consensus genome generation differ between laboratories. These results underscore the need for a unified, best-practices approach to bioinformatics between laboratories working on a common outbreak problem.
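
The effect described in the abstract, where the same reads yield different consensus calls under different allele frequency thresholds, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy per-site read counts, and the 0.75 and 0.5 thresholds are all assumptions chosen for illustration, and only two-base IUPAC ambiguity codes are handled.

```python
from collections import Counter

UNAMBIGUOUS = set("ACGT")

# IUPAC codes for two-base mixtures (minimal subset, enough for this sketch).
IUPAC_TWO = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def call_base(counts, min_frac):
    """Call one consensus site from a mapping of base -> read depth.

    The majority allele is called outright only if it reaches min_frac of
    the reads; otherwise the top two alleles are reported as an IUPAC code.
    """
    total = sum(counts.values())
    if total == 0:
        return "N"  # no coverage
    ranked = Counter(counts).most_common()
    top_base, top_depth = ranked[0]
    if top_depth / total >= min_frac:
        return top_base
    second_base = ranked[1][0]
    return IUPAC_TWO.get(frozenset((top_base, second_base)), "N")


def pairwise_differences(seq_a, seq_b, ignore_ambiguous=True):
    """Count differing sites between two aligned consensus sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if ignore_ambiguous and (a not in UNAMBIGUOUS or b not in UNAMBIGUOUS):
            continue  # skip sites where either call is ambiguous
        if a != b:
            diffs += 1
    return diffs


# Toy read counts for three sites of one specimen (hypothetical values).
pileup = [{"A": 100}, {"C": 80, "T": 20}, {"G": 55, "T": 45}]

# Two hypothetical laboratories using different thresholds on the same data.
lab_strict = "".join(call_base(site, 0.75) for site in pileup)
lab_lenient = "".join(call_base(site, 0.5) for site in pileup)

raw_diffs = pairwise_differences(lab_strict, lab_lenient, ignore_ambiguous=False)
snv_diffs = pairwise_differences(lab_strict, lab_lenient, ignore_ambiguous=True)
```

In this toy case the stricter threshold reports the mixed third site as an ambiguity code while the lenient one calls a definite base, so the two sequences differ by one site when ambiguity is counted and by none when ambiguous sites are ignored.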

Version published to 10.3390/v14020185
Jan 19, 2022

SciScore for 10.1101/2021.08.19.21262296:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Each SARS-CoV-2-positive extract is also sent to the Institute of Clinical Pathology and Medical Research (ICPMR), NSW Health Pathology-West, NSW, Australia, for WGS according to their established protocols (8).	WGS suggested: None
Library preparation was carried out using an Illumina Nextera XT Kit, followed by sequencing on an Illumina iSeq or MiniSeq (150 cycles).	MiniSeq suggested: None
Clean reads were then mapped to the NCBI RefSeq assembly of SARS-CoV-2 (NC_045512.2) using bwa mem v0.7.17-r1188 (26), with unmapped reads discarded, and primer sequences were soft-clipped from the alignment using ivar trim v.	RefSeq suggested: (RefSeq, RRID:SCR_003496)
Alignments were converted to pileup format using samtools mpileup v1.10 (27) without discarding anomalous read pairs (-A), per-base alignment quality disabled (-B), and no minimum PHRED quality for bases (-Q 0).	samtools suggested: (SAMTOOLS, RRID:SCR_002105)
Demultiplexed raw sequencing data from Lab2 were quality trimmed using Trimmomatic (v0.36, sliding window of 4, minimum read quality score of 20, leading/trailing quality of 5) (29).	Trimmomatic suggested: (Trimmomatic, RRID:SCR_011848)
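
The trimming parameters quoted in the last row (sliding window of 4, minimum quality of 20, leading/trailing quality of 5) can be made concrete with a simplified Python sketch. This is an illustration of the general idea only, not Trimmomatic's actual implementation: the function names, the order in which the steps are applied, and the example quality scores are assumptions.

```python
def trim_ends(quals, leading=5, trailing=5):
    """Return (start, end) slice bounds after dropping low-quality end bases.

    Bases are removed from the 5' end while below `leading` and from the
    3' end while below `trailing`.
    """
    start, end = 0, len(quals)
    while start < end and quals[start] < leading:
        start += 1
    while end > start and quals[end - 1] < trailing:
        end -= 1
    return start, end


def sliding_window_trim(quals, window=4, min_mean_q=20):
    """Return the number of 5' bases to keep.

    Scans 5' to 3' and cuts at the start of the first window whose mean
    quality falls below min_mean_q. Reads shorter than one window are
    kept whole (a simplification in this sketch).
    """
    for start in range(max(0, len(quals) - window + 1)):
        if sum(quals[start:start + window]) / window < min_mean_q:
            return start
    return len(quals)


# Hypothetical per-base PHRED scores for one read.
quals = [2, 30, 32, 31, 30, 28, 12, 10, 8, 3]

# Assumed order for this sketch: end trimming first, then the sliding window.
start, end = trim_ends(quals)
after_ends = quals[start:end]
keep = sliding_window_trim(after_ends)
trimmed = after_ends[:keep]
```

For this example read, end trimming drops the first and last bases, and the sliding window then cuts the read where the four-base mean first falls below 20.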

Results from OddPub: Thank you for sharing your code and data.

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.

Version published to 10.1101/2021.08.19.21262296 on medRxiv
Aug 22, 2021

Related articles

DIVERSITY AND CLINICAL CORRELATIONS OF SARS-CoV-2 VARIANT DURING THE INTRODUCTION OF THE DELTA VARIANT IN GUATEMALA

This article has 13 authors:
1. Claudia Carranza
2. Lucia Ortiz
3. Maria Eugenia Castellanos
4. Ana Silvia Gonzalez-Reiche
5. Renata Mendizabal-Cabrera
6. Zain Khalil
7. Adriana van DeGuchte
8. Keith Farrugia
9. Mariana Herrera
10. Ernesto Mena
11. Celia Cordon-Rosales
12. Harm van Bakel
13. Daniel R. Perez
Reviewed by Access Microbiology

This article has 3 evaluations. Latest version: Feb 3, 2026. Latest activity: Jul 20, 2025.
Rapid Phylogenomic Analysis of Thousands Outbreak‐Causing Viral Genomes Using Covary

This article has 1 author:
1. Marvin I. De los Santos
This article has no evaluations. Latest version: Dec 22, 2025.
Overview of SARS-CoV-2 Genomic Surveillance in Central America and the Dominican Republic from February 2020 to January 2023: The Impact of PAHO and COMISCA's Collaborative Efforts

This article has 31 authors:
1. Sofia Herrera Agüero
2. Aldo Sosa
3. Alexander Martínez
4. Ambar Moreno
5. César Roberto Conde Pereira
6. Claudia Gonzalez
7. Claudio Soto Garita
8. Daniel Ulate
9. Estela Cordero-Laurent
10. Hebleen Brenes
11. Isaac Miguel Sánchez
12. Jairo Mendez-Rico
13. Jessica Góndola
14. Jose Arturo Molina-Mora
15. Juliana Leite
16. Leticia Franco
17. Linda Mendoza
18. Lionel Gresh
19. Lucia De La Cruz
20. Mitzi Castro Paz
21. Monica Barahona
22. Naomi Iihoshi
23. Oris Chavarria
24. Priscila Born
25. Ruby Melany Aguillón
26. Ruth Carolina Vasquez Cordova
27. Selene Gonzalez
28. Sofia Carolina Alvarado Silva
29. Xochitl Sandoval López
30. Yvonne Imbert
31. Francisco Duarte-Martínez
This article has no evaluations. Latest version: Jan 14, 2026.
