Guidelines for accurate genotyping of SARS-CoV-2 using amplicon-based sequencing of clinical samples
Listed in: Evaluated articles (ScreenIT)
Abstract
Background
SARS-CoV-2 genotyping has been instrumental in monitoring virus evolution and transmission during the pandemic. The reliability of the information extracted from genotyping efforts depends on several factors, including the quality of the input material, the technology applied, and potential laboratory-specific biases. These variables must be monitored to ensure genotype reliability. The current lack of guidelines for SARS-CoV-2 genotyping leads to the inclusion of error-containing genome sequences in studies of viral spread and evolution.
Results
We used clinical samples and synthetic viral genomes to evaluate the impact of experimental factors, including viral load and sequencing depth, on correct sequence determination using an amplicon-based approach. We found that at least 1000 viral genomes are necessary to confidently detect variants in the genome at frequencies of 10% or higher. The broad applicability of our recommendations was validated in >200 clinical samples from six independent laboratories. The genotypes of clinical isolates with viral load above the recommended threshold cluster by sampling location and period. Our analysis also supports the rise in frequency of 20A.EU1 and 20A.EU2, two recently reported European strains whose dissemination was favoured by travel during the summer of 2020.
Conclusions
We present much-needed recommendations for reliable determination of the SARS-CoV-2 genome sequence and demonstrate their broad applicability in a large cohort of clinical samples.
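The link the abstract draws between viral load and confident detection of variants at 10% frequency can be illustrated with a simple sampling model. The sketch below is not the authors' analysis: it assumes that the genomes entering the assay are an independent binomial sample of the viral population, and it ignores amplification and sequencing error. The function names are hypothetical.

```python
import math


def observed_frequency_sd(true_freq: float, n_genomes: int) -> float:
    """Standard deviation of a variant's frequency among n sampled genomes.

    Assumes each sampled genome carries the variant independently with
    probability true_freq (binomial sampling; an illustrative assumption).
    """
    return math.sqrt(true_freq * (1.0 - true_freq) / n_genomes)


def prob_variant_absent(true_freq: float, n_genomes: int) -> float:
    """Probability that none of the n sampled genomes carries the variant."""
    return (1.0 - true_freq) ** n_genomes


if __name__ == "__main__":
    # Compare input amounts for a variant present at 10% in the population.
    for n in (10, 100, 1000, 10000):
        sd = observed_frequency_sd(0.10, n)
        p0 = prob_variant_absent(0.10, n)
        print(f"genomes={n:>6}  sd={sd:.4f}  P(absent)={p0:.3g}")
```

Under this model, the spread of the measured frequency shrinks with the square root of the number of input genomes, which is one reason low viral loads make minor variants harder to call reliably.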
Article activity feed
SciScore for 10.1101/2020.12.01.405738
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms (sentence from the manuscript, followed by the suggested resource)
- Mapping to human transcriptome and genome were performed using STAR (ver 2.7) (41) and bwa mem (version 0.7.12-r1039) (42), respectively.
  STAR, suggested: (STAR, RRID:SCR_015899)
- Read coverage was estimated using bedtools coverage from the bedtools suite (version v2.29.2) (43) for the portion of the viral genome targeted by the amplicons (positions 36-29,844 of the reference genome).
  bedtools, suggested: (BEDTools, RRID:SCR_006646)
- Read subsampling was performed with samtools v1.9 (44).
  samtools, suggested: (SAMTOOLS, RRID:SCR_002105)
- Statistical analysis: Statistical data analyses were performed using the R software environment for statistical computing (46) and graphics with the ggplot2 package (47).
  ggplot2, suggested: (ggplot2, RRID:SCR_014601)
- The datasets generated with synthetic SARS-CoV-2 genome and analysed during the current study are available in the Sequence Read Archive (SRA) repository under accession number SUB8654793.
  Sequence Read Archive, suggested: (DDBJ Sequence Read Archive, RRID:SCR_001370)
Results from OddPub: Thank you for sharing your data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.
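The coverage and subsampling sentences quoted in the Resources table involve two small calculations that are easy to get wrong: turning the amplicon-targeted positions 36-29,844 into a BED interval for a coverage tool, and choosing a read fraction for subsampling. The sketch below is only an illustration under stated assumptions. The helper names are hypothetical, the positions are assumed to be 1-based and inclusive, and the read counts in the usage lines are made up. It does not reproduce the authors' actual commands.

```python
from typing import Tuple


def to_bed_interval(start_1based: int, end_1based: int) -> Tuple[int, int]:
    """Convert a 1-based inclusive range to a 0-based half-open BED interval."""
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    return start_1based - 1, end_1based


def subsample_fraction(total_reads: int, target_reads: int) -> float:
    """Fraction of reads to keep so that about target_reads remain."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    # Never request more reads than are available.
    return min(1.0, target_reads / total_reads)


if __name__ == "__main__":
    # Amplicon-targeted region quoted in the report: positions 36-29,844.
    bed_start, bed_end = to_bed_interval(36, 29844)
    print(f"BED interval: {bed_start}-{bed_end} ({bed_end - bed_start} bases)")

    # Hypothetical read counts, for illustration only.
    print(f"fraction to keep: {subsample_fraction(2_000_000, 500_000):.2f}")
```

The fraction computed this way is the kind of value a subsampling step expects. For example, `samtools view -s` takes a number whose integer part is the random seed and whose fractional part is the fraction of reads to keep.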
