Multiple Introductions Followed by Ongoing Community Spread of SARS-CoV-2 at One of the Largest Metropolitan Areas of Northeast Brazil

Abstract

Multiple epicenters of the SARS-CoV-2 pandemic, such as Italy, the USA, and Brazil, have emerged since the first pneumonia cases in Wuhan, China. Brazil is the third-most affected country worldwide, but genomic sequences of SARS-CoV-2 strains are mostly restricted to states from the Southeast region. Pernambuco state, located in the Northeast region, is the sixth-most affected Brazilian state, but very few genomic sequences from the strains circulating in this region are available. We sequenced 101 strains of SARS-CoV-2 from patients residing in Pernambuco who presented Covid-19 symptoms. Phylogenetic reconstructions revealed that all genomes belong to the B lineage and that most of the samples (88%) were classified as lineage B.1.1. We detected multiple viral introductions from abroad (likely from Europe) as well as six local B.1.1 clades composed of Pernambuco-only strains. Local clades comprise sequences from the capital city (Recife) and other countryside cities, corroborating community spread between different municipalities of the state. These findings demonstrate that, unlike in Southeastern Brazilian states, where the epidemics were mainly driven by one dominant lineage (B.1.1.28 or B.1.1.33), the early epidemic phase in Pernambuco state was driven by multiple B.1.1 lineages seeded through both national and international travel.

Article activity feed

  1. SciScore for 10.1101/2020.08.25.20171595:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Sequencing was performed on a MiSeq (Illumina) instrument using the MiSeq Reagent kit V3 (150 cycles) with a paired-end strategy.
    MiSeq
    suggested: (A5-miseq, RRID:SCR_012148)
    Genome Assembly and Annotation: Low-quality raw sequencing reads and primer sequences were removed using Trimmomatic 0.36 with default parameters.
    Trimmomatic
    suggested: (Trimmomatic, RRID:SCR_011848)
    Based on the knowledge that epidemic viruses sampled over short time frames do not accumulate a substantial number of mutations, we performed a reference-based assembly against the first published SARS-CoV-2 genome (NC_045512.2) using Bowtie2 software [23] with default parameters.
    Bowtie2
    suggested: (Bowtie 2, RRID:SCR_016368)
    Next, we generated a .bed file using samtools 1.5 [24] and genomeCoverageBed from bedtools v 2.15.0 [25], keeping only positions with > 5x coverage.
    samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    bedtools
    suggested: (BEDTools, RRID:SCR_006646)
    Lastly, we used vcf-annotate (parameters --filter Qual=20/MinDP=100/SnpGap=20) and vcf-consensus from vcftools v 0.1.13 [26] to generate the final consensus sequences.
    vcftools
    suggested: (VCFtools, RRID:SCR_001235)
    These genomes and the 38 sequenced in our study were aligned with the reference genome NC_045512.2 using MAFFT add v7.310 [29] with the --keep-length parameter.
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    The SARS-CoV-2 lineages were assigned with pangolin (https://pangolin.cog-uk.io/), and the phylogenetic trees were reconstructed with IQ-TREE [33] using 1000 replicates of the ultrafast bootstrap method [34].
    IQ-TREE
    suggested: (IQ-TREE, RRID:SCR_017254)
    After the ML reconstruction with IQ-TREE, the tree was evaluated in Tempest 1.5.3 [37] to check the root-to-tip temporal signal.
    Tempest
    suggested: (TempEst, RRID:SCR_017304)
    Outlier sequences were removed before the phylodynamics analysis performed in BEAST 1.10.4 [38].
    BEAST
    suggested: (BEAST, RRID:SCR_010228)
    The time-scaled trees were visualized in Figtree 1.4.4.
    Figtree
    suggested: (FigTree, RRID:SCR_008515)
    Owing to the importance of the Spike protein in SARS-CoV-2 biology, Single Amino acid Polymorphisms (SAPs) and regions where deletions were found in other SARS-CoV-2 genomes [42] were carefully analyzed using Aliview and karyoploteR (http://bioconductor.org/packages/release/bioc/html/karyoploteR.html).
    Aliview
    suggested: (AliView, RRID:SCR_002780)
    Plots were generated using the ggplot2 package of the R statistical language (https://www.r-project.org/).
    ggplot2
    suggested: (ggplot2, RRID:SCR_014601)
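
The coverage step quoted in the table (keeping only positions with > 5x coverage and writing them to a .bed file) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation, which relied on samtools and genomeCoverageBed; the function names `covered_intervals` and `to_bed` and the toy depth values are hypothetical and chosen only for illustration. The sketch assumes a list of per-base read depths as input and emits 0-based, half-open intervals, as in the BED format.

```python
from typing import Iterable, List, Tuple


def covered_intervals(depths: Iterable[int], threshold: int = 5) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) intervals whose depth is strictly above threshold."""
    intervals: List[Tuple[int, int]] = []
    start = None  # start of the interval currently being extended, if any
    pos = -1      # stays -1 when the input is empty
    for pos, depth in enumerate(depths):
        if depth > threshold:
            if start is None:
                start = pos  # open a new well-covered interval
        elif start is not None:
            intervals.append((start, pos))  # close the interval at the first low-depth base
            start = None
    if start is not None:
        intervals.append((start, pos + 1))  # interval runs to the end of the sequence
    return intervals


def to_bed(chrom: str, intervals: List[Tuple[int, int]]) -> str:
    """Format intervals as tab-separated BED lines (chrom, start, end)."""
    return "\n".join(f"{chrom}\t{start}\t{end}" for start, end in intervals)


# Toy depths, invented for illustration only (not data from the study).
example_depths = [0, 3, 6, 10, 12, 5, 4, 8, 9]
example_bed = to_bed("NC_045512.2", covered_intervals(example_depths))
```

Note that a depth of exactly 5 is excluded here, matching the strict "> 5x" wording of the quoted sentence; whether the original pipeline treated the boundary the same way is an assumption.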

    Results from OddPub: Thank you for sharing your code.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.
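
As a rough illustration of what collecting RRIDs from a report like this one might involve, the sketch below parses the "suggested: (Name, RRID:...)" lines shown in Table 2 into a name-to-identifier mapping. It is not SciScore's implementation, and it checks only for presence, not correctness. The function name `parse_suggestions` and the regular expression are assumptions; the pattern covers only identifiers shaped like the ones in this report (uppercase letters, an underscore, then digits), and other RRID formats would need a broader pattern.

```python
import re
from typing import Dict

# Matches lines such as: suggested: (Trimmomatic, RRID:SCR_011848)
# Assumption: the resource name contains no comma and the RRID looks like SCR_011848.
SUGGESTION = re.compile(
    r"suggested:\s*\((?P<name>[^,]+),\s*RRID:(?P<rrid>[A-Z]+_\d+)\)"
)


def parse_suggestions(report_text: str) -> Dict[str, str]:
    """Map each suggested resource name to its RRID, in order of appearance."""
    return {
        match.group("name").strip(): match.group("rrid")
        for match in SUGGESTION.finditer(report_text)
    }


# Two lines copied from the resource table of this report.
sample = """
    suggested: (Trimmomatic, RRID:SCR_011848)
    suggested: (Bowtie 2, RRID:SCR_016368)
"""
suggestions = parse_suggestions(sample)
```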