ACoRE: Accurate SARS-CoV-2 genome reconstruction for the characterization of intra-host and inter-host viral diversity in clinical samples and for the evaluation of re-infections

Version published to 10.1016/j.ygeno.2021.04.008

Jul 1, 2021

SciScore for 10.1101/2021.01.22.21250285:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement	IACUC: The study was approved by the competent Ethical Committee for Clinical Research of Verona and Rovigo Provinces (Prot N° 39528/2020)
Randomization	Data filtering and reference genome alignment: Full-length amplicon sequencing data were randomly downsampled using seqtk sample v1.3 (https://github.com/lh3/seqtk).
Blinding	not detected.
Power Analysis	not detected.
Sex as a biological variable	not detected.
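
The Randomization row above refers to random downsampling of reads with seqtk sample. As an illustration only, the following is a minimal Python sketch of reservoir sampling, a standard way to draw a fixed number of reads uniformly at random in one pass. The names `parse_fastq` and `reservoir_sample` are hypothetical, and the sketch is not claimed to reproduce seqtk's implementation or the authors' exact settings.

```python
import random
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: header, sequence, separator, quality string.
FastqRecord = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records, ignoring blank lines (hypothetical helper)."""
    stripped = [ln.rstrip("\n") for ln in lines if ln.strip()]
    for i in range(0, len(stripped) - 3, 4):
        yield tuple(stripped[i:i + 4])  # type: ignore[misc]


def reservoir_sample(records: Iterable[FastqRecord], k: int, seed: int) -> List[FastqRecord]:
    """Return k records chosen uniformly at random (all records if fewer than k)."""
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    reservoir: List[FastqRecord] = []
    for n, rec in enumerate(records):
        if n < k:
            reservoir.append(rec)  # fill the reservoir with the first k records
        else:
            j = rng.randint(0, n)  # inclusive bounds: 0..n
            if j < k:
                reservoir[j] = rec  # replace a slot with probability k / (n + 1)
    return reservoir


# Toy input: ten synthetic reads (assumed data, for illustration only).
toy_lines = []
for i in range(10):
    toy_lines += [f"@read{i}", "ACGT", "+", "IIII"]

subsample = reservoir_sample(parse_fastq(toy_lines), k=3, seed=11)
```

In this sketch, the slots that get replaced depend only on the seed, `k`, and the number of records, so calling it with the same seed on two mate files of equal length selects the same read positions in both.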

Table 2: Resources

Software and Algorithms
Sentences	Resources
The amplification cycle threshold (Ct) was determined using CFX Maestro (Bio-Rad Laboratories), setting a baseline threshold at 200 relative fluorescence units (RFU).	Bio-Rad Laboratories suggested: (Bio-Rad Laboratories, RRID:SCR_008426)
Barcoded libraries were pooled at equimolar concentrations and sequenced on the MiSeq platform (Illumina, San Diego, CA, USA) with MiSeq Reagent Kit v2 to generate 250-bp paired-end (250PE) reads.	MiSeq suggested: (A5-miseq, RRID:SCR_012148)
All sequencing datasets were trimmed for quality and adapters were removed using Trimmomatic v0.39 [56] with the following parameters: ILLUMINACLIP:adapters_file:2:30:10 LEADING:5 TRAILING:5 SLIDINGWINDOW:4:20	Trimmomatic suggested: (Trimmomatic, RRID:SCR_011848)
Filtered reads were aligned to the SARS-CoV-2 reference genome (GenBank ID: MN908947.3) using BWA MEM v0.7.17 [57] with default parameters and the relative alignment file was converted to a binary alignment map (BAM) file using SAMtools v1.9 [58].	BWA suggested: (BWA, RRID:SCR_010910)
For the fragmented libraries, duplicate reads were identified and discarded using Picard v2.21.1 (http://broadinstitute.github.io/picard).	Picard suggested: (Picard, RRID:SCR_006525)
Coverage and genotypability statistics were calculated from the BAM files using bedtools genomecov v2.19.1 [59] and GATK CallableLoci v3.8 [60], respectively.	bedtools suggested: (BEDTools, RRID:SCR_006646) GATK suggested: (GATK, RRID:SCR_001876)
Raw genomic sequencing data were deposited in NCBI GenBank (BioProject no. PRJNA690890).	BioProject suggested: (NCBI BioProject, RRID:SCR_004801)
Consensus variant calling and generation of the consensus sequence: A pileup was calculated for each position in the BAM file of each replicate using the SAMtools v1.9 mpileup option with parameters -aa -A -d 0 -Q 0.	SAMtools suggested: (SAMTOOLS, RRID:SCR_002105)
To call variants present in the consensus sequences (consensus variants), sequences were aligned to the SARS-CoV-2 reference genome using Minimap v2.17 [61] and the alignment file was converted to the BAM format using SAMtools v1.9.	Minimap suggested: None
The output file was used to detect iSNVs with VarScan mpileup2cns v2.3.9 [62] and the following parameters: --min-var-freq 0.03 --min-avg-qual 20.	VarScan suggested: (VARSCAN, RRID:SCR_006849)
We used GraphPad Prism 6.0 (GraphPad Software, San Diego, CA, USA) for all statistical analysis, with a significance threshold of p < 0.05.	GraphPad Prism suggested: (GraphPad Prism, RRID:SCR_002798) GraphPad suggested: (GraphPad Prism, RRID:SCR_002798)
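
The VarScan row above cites two thresholds for iSNV detection: a minimum variant frequency of 0.03 and a minimum base quality of 20. The sketch below shows, under simplifying assumptions, how those two thresholds act on the bases observed at a single position. The function name `minor_variants` and the input layout (a list of base and Phred quality pairs) are hypothetical, and VarScan applies further criteria that are not modelled here.

```python
from collections import Counter
from typing import Dict, List, Tuple

MIN_VAR_FREQ = 0.03   # from --min-var-freq 0.03 in the quoted sentence
MIN_BASE_QUAL = 20    # from --min-avg-qual 20 in the quoted sentence


def minor_variants(
    ref_base: str,
    observations: List[Tuple[str, int]],
    min_var_freq: float = MIN_VAR_FREQ,
    min_base_qual: int = MIN_BASE_QUAL,
) -> Dict[str, float]:
    """Return non-reference alleles whose frequency passes the threshold.

    observations: (base, Phred quality) pairs seen at one genome position.
    Bases below min_base_qual are not counted towards depth or alleles.
    """
    counts = Counter(
        base.upper() for base, qual in observations if qual >= min_base_qual
    )
    depth = sum(counts.values())
    if depth == 0:
        return {}  # nothing usable at this position
    return {
        base: n / depth
        for base, n in counts.items()
        if base != ref_base.upper() and n / depth >= min_var_freq
    }


# Toy pileup column (assumed data): 95 reference reads, 5 high-quality G reads,
# and 2 low-quality T reads that are discarded by the quality filter.
column = [("A", 30)] * 95 + [("G", 30)] * 5 + [("T", 10)] * 2
variants = minor_variants("A", column)
```

With this toy column only G is reported, because the two T observations fall below the quality cutoff before frequencies are computed.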

Results from OddPub: Thank you for sharing your data.

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

Increasing the depth of sequencing has been proposed as a strategy to achieve complete genome reconstruction in low-titer samples, but this does not overcome limitations caused by missing amplicons [43]. Similarly, improvement in ARTIC primer design and compatibility (currently version 3) can also ameliorate genome coverage, but again cannot make up for missing amplicons [24,30]. We found that only a few specific amplicons were reproducibly suboptimal (64, 70 and 91) whereas most showed coverage variations limited to particular samples or replicates. We therefore merged the sequencing data from two or more replicates as a simple solution to enhance coverage and genotypability, achieving a more homogeneous representation of the viral genome and rescuing the suboptimal samples. The random amplification observed in low-titer samples most likely reflects the low sample complexity rather than poor assay sensitivity or performance. Accordingly, the sampled RNA and corresponding cDNA fragments before amplification are unlikely to represent the complete genome based on our observation that the coverage achieved by sequencing two amplification replicates (each from 5 µL of cDNA) was similar to that achieved with a single amplification starting from double the amount of cDNA (10 µL). Therefore, to optimize genome reconstruction, a single large cDNA batch should be amplified in several parallel reactions, using as much sample volume as possible to increase complexity. The multiple PCR p...
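
The limitation text above describes merging sequencing data from two or more replicates to improve coverage. A rough way to see the effect is to sum per-position depth across replicates and compare the fraction of positions that reach a chosen depth. The sketch below assumes per-position depth lists are already available; the names `merge_depths` and `fraction_covered` and the cutoff of 10 reads are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def merge_depths(*replicates: Sequence[int]) -> List[int]:
    """Sum per-position depth across replicates of equal length."""
    if len({len(r) for r in replicates}) > 1:
        raise ValueError("all replicates must cover the same positions")
    return [sum(position) for position in zip(*replicates)]


def fraction_covered(depths: Sequence[int], min_depth: int) -> float:
    """Fraction of positions with depth >= min_depth (0.0 for empty input)."""
    if not depths:
        return 0.0
    return sum(d >= min_depth for d in depths) / len(depths)


# Toy depths over four positions (assumed data): each replicate misses
# different positions, as described for low-titer samples.
rep1 = [20, 0, 15, 3]
rep2 = [0, 12, 15, 4]
merged = merge_depths(rep1, rep2)

MIN_DEPTH = 10  # hypothetical cutoff for this illustration
before = (fraction_covered(rep1, MIN_DEPTH), fraction_covered(rep2, MIN_DEPTH))
after = fraction_covered(merged, MIN_DEPTH)
```

In this toy case each replicate alone covers half of the positions at the chosen cutoff, while the merged depth covers three quarters. The last position stays below the cutoff in both replicates, which mirrors the point that merging cannot recover regions that are missing everywhere.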

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from scite Reference Check: We found no unreliable references.

About SciScore

SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.
