Multiple Introductions Followed by Ongoing Community Spread of SARS-CoV-2 at One of the Largest Metropolitan Areas of Northeast Brazil

Abstract

Since the first pneumonia cases were reported in Wuhan, China, multiple epicenters of the SARS-CoV-2 pandemic have emerged, including Italy, the USA, and Brazil. Brazil is the third most affected country worldwide, but the available genomic sequences of SARS-CoV-2 strains come mostly from states in the Southeast region. Pernambuco state, located in the Northeast region, is the sixth most affected Brazilian state, yet very few genomic sequences from the strains circulating in this region are available. We sequenced 101 strains of SARS-CoV-2 from patients residing in Pernambuco who presented Covid-19 symptoms. Phylogenetic reconstructions revealed that all genomes belong to the B lineage and that most of the samples (88%) were classified as lineage B.1.1. We detected multiple viral introductions from abroad (likely from Europe) as well as six local B.1.1 clades composed exclusively of Pernambuco strains. Local clades comprise sequences from the capital city (Recife) and from other cities in the interior of the state, corroborating community spread between different municipalities of the state. These findings demonstrate that, unlike in Southeastern Brazilian states, where the epidemics were largely driven by one dominant lineage (B.1.1.28 or B.1.1.33), the early epidemic phase in Pernambuco state was driven by multiple B.1.1 lineages seeded through both national and international travel.

SciScore for 10.1101/2020.08.25.20171595:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Sequencing was performed on the MiSeq (Illumina) machine using MiSeq Reagent kit V3 of 150 cycles employing a paired-end strategy.	MiSeq suggested: (A5-miseq, RRID:SCR_012148)
Genome Assembly and Annotation: Low-quality raw sequencing reads and primer sequences were removed using Trimmomatic 0.36 with default parameters.	Trimmomatic suggested: (Trimmomatic, RRID:SCR_011848)
Based on the knowledge that epidemic viruses sampled over short time frames do not accumulate a substantial number of mutations, we performed a reference-based assembly using the first published SARS-CoV-2 genome as reference (NC_045512.2) with Bowtie2 software [23] and default parameters.	Bowtie2 suggested: (Bowtie 2, RRID:SCR_016368)
Next, we generated a .bed file using samtools 1.5 [24] and genomeCoverageBed from bedtools v 2.15.0 [25], keeping only positions with > 5x coverage.	samtools suggested: (SAMTOOLS, RRID:SCR_002105) bedtools suggested: (BEDTools, RRID:SCR_006646)
Lastly, we used vcf-annotate (parameters --filter Qual=20/MinDP=100/SnpGap=20) and vcf-consensus from vcftools v 0.1.13 [26] to generate the final consensus sequences.	vcftools suggested: (VCFtools, RRID:SCR_001235)
These genomes and the 38 sequenced in our study were aligned with the reference genome NC_045512.2 using MAFFT add v7.310 [29] with the --keep-length parameter.	MAFFT suggested: (MAFFT, RRID:SCR_011811)
The SARS-CoV-2 lineages were assigned with pangolin (https://pangolin.cog-uk.io/) and the phylogenetic trees were reconstructed with IQ-TREE [33] using 1000 replicates of the ultrafast bootstrap method [34].	IQ-TREE suggested: (IQ-TREE, RRID:SCR_017254)
After the ML reconstruction with IQ-TREE the tree was evaluated in Tempest 1.5.3 [37] to check the root-to-tip temporal signal.	Tempest suggested: (TempEst, RRID:SCR_017304)
Outlier sequences were removed before the phylodynamics analysis performed in BEAST 1.10.4 [38].	BEAST suggested: (BEAST, RRID:SCR_010228)
The time-scaled trees were visualized on Figtree 1.4.4.	Figtree suggested: (FigTree, RRID:SCR_008515)
Owing to the importance of the Spike protein in SARS-CoV-2 biology, Single Amino acid Polymorphisms (SAPs) and regions where deletions were found in other SARS-CoV-2 genomes [42] were carefully analyzed using Aliview and karyoploteR (http://bioconductor.org/packages/release/bioc/html/karyoploteR.html).	Aliview suggested: (AliView, RRID:SCR_002780)
Plots were generated using the ggplot2 package of the R statistical language (https://www.r-project.org/).	ggplot2 suggested: (ggplot2, RRID:SCR_014601)
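
The temporal-signal check quoted in the table (root-to-tip regression in TempEst, with outlier sequences removed before BEAST) can be illustrated with a minimal Python sketch. This is not the authors' code and it does not call TempEst. It assumes that sampling dates (as decimal years) and root-to-tip distances have already been extracted from the ML tree. The function names, the example values, and the outlier rule (residuals more than k median absolute deviations from the median) are hypothetical choices made for illustration; the source does not state how outliers were identified.

```python
from statistics import mean, median


def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    dates: sampling dates as decimal years.
    distances: root-to-tip distances (substitutions per site).
    """
    if len(dates) != len(distances) or len(dates) < 3:
        raise ValueError("need at least three (date, distance) pairs")
    mx, my = mean(dates), mean(distances)
    sxx = sum((x - mx) ** 2 for x in dates)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dates, distances))
    syy = sum((y - my) ** 2 for y in distances)
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    slope = sxy / sxx                  # crude rate estimate (subs/site/year)
    intercept = my - slope * mx
    r2 = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    residuals = [y - (intercept + slope * x) for x, y in zip(dates, distances)]
    # The date at which the fitted line crosses zero distance gives a
    # rough estimate of the time of the most recent common ancestor.
    x_intercept = -intercept / slope if slope != 0 else None
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": r2,
        "x_intercept": x_intercept,
        "residuals": residuals,
    }


def flag_outliers(residuals, k=3.0):
    """Indices of residuals more than k MADs from the median residual.

    Assumption: this MAD rule is an illustrative stand-in for the visual
    inspection usually done in TempEst.
    """
    med = median(residuals)
    mad = median([abs(r - med) for r in residuals])
    if mad == 0:
        return []
    return [i for i, r in enumerate(residuals) if abs(r - med) > k * mad]


if __name__ == "__main__":
    # Hypothetical example values, not data from the study.
    dates = [2020.20, 2020.25, 2020.30, 2020.35, 2020.40, 2020.45]
    dists = [8e-4 * (d - 2019.9) for d in dates]
    dists[3] += 0.002                  # one artificially divergent sequence
    fit = root_to_tip_regression(dates, dists)
    print("R^2:", round(fit["r2"], 3))
    print("flagged indices:", flag_outliers(fit["residuals"]))
```

A positive slope with a reasonably high R² suggests enough temporal signal for a time-scaled analysis. Sequences flagged here would be candidates for manual review, not for automatic removal.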

Results from OddPub: Thank you for sharing your code.

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Related articles

Genomic characterization of SARS-CoV-2 variants circulating in the population of Bangui, Central African Republic (CAR) in 2022.

Overview of SARS-CoV-2 Genomic Surveillance in Central America and the Dominican Republic from February 2020 to January 2023: The Impact of PAHO and COMISCA's Collaborative Efforts

DIVERSITY AND CLINICAL CORRELATIONS OF SARS-CoV-2 VARIANT DURING THE INTRODUCTION OF THE DELTA VARIANT IN GUATEMALA