Genomic surveillance of SARS-CoV-2 reveals community transmission of a major lineage during the early pandemic phase in Brazil

Abstract

Despite all efforts to control the spread of COVID-19, SARS-CoV-2 reached South America within three months of its first detection in China, and Brazil became one of the COVID-19 hotspots in the world. Several SARS-CoV-2 lineages have been identified and some local clusters have been described in this early pandemic phase in Western countries. Here we investigated the genetic diversity of SARS-CoV-2 during the early phase (late February to late April) of the epidemic in Brazil. Phylogenetic analyses revealed multiple introductions of SARS-CoV-2 into Brazil and the community transmission of a major B.1.1 lineage defined by two amino acid substitutions in the Nucleocapsid and ORF6. This SARS-CoV-2 Brazilian lineage was probably established during February 2020 and rapidly spread through the country, reaching different Brazilian regions by the middle of March 2020. Our study also supports occasional exportations of this Brazilian B.1.1 lineage to neighboring South American countries and to more distant countries before the implementation of international air travel restrictions in Brazil.

Article activity feed

  1. SciScore for 10.1101/2020.06.17.158006:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    The sequencing was performed for 12 hours using the high accuracy base calling in the MinKNOW software; however, the run was monitored by RAMPART (https://github.com/articnetwork/rampart), allowing us to stop the assay after 2 hours, when ≥ 20× depth for all amplicons was achieved.
    MinKNOW
    suggested: None
    RAMPART
    suggested: (Rampart, RRID:SCR_016742)
    We used an earlier version of the workflow which used Porechop to demultiplex the reads.
    Porechop
    suggested: (Porechop, RRID:SCR_016967)
    To achieve this aim, sequences from the UK were grouped by similarity with the CD-HIT program 27 and one sequence per cluster was selected.
    CD-HIT
    suggested: (CD-HIT, RRID:SCR_007105)
    With this sampling procedure, we obtained a balanced global reference B.1.1 dataset containing 3,764 sequences that were aligned with the new B.1.1 Brazilian sequences generated in this study using MAFFT v7.467 28 and then subjected to maximum-likelihood (ML) phylogenetic analyses.
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    The ML phylogenetic tree was inferred using IQTREE v1.6.12 29, under the GTR+F+I+G4 nucleotide substitution model as selected by the ModelFinder application 30 and the branch support was assessed by the approximate likelihood-ratio test based on a Shimodaira–Hasegawa-like procedure (SH-aLRT) with 1,000 replicates.
    ModelFinder
    suggested: None
    BR dataset was inferred as explained above and the temporal signal was assessed by performing a regression analysis of the root-to-tip divergence against sampling time using TempEst 31.
    TempEst
    suggested: (TempEst, RRID:SCR_017304)
    BR lineages were jointly estimated using a Bayesian Markov Chain Monte Carlo (MCMC) approach implemented in BEAST 1.10 33, using the BEAGLE library v3 34 to improve computational time.
    BEAST
    suggested: (BEAST, RRID:SCR_010228)
    BEAGLE
    suggested: (BEAGLE, RRID:SCR_001789)
    Stationarity (constant mean and variance of trace plots) and good mixing (Effective Sample Size >200) for all parameter estimates were assessed using TRACER v1.7 38.
    TRACER
    suggested: (Tracer, RRID:SCR_019121)
    The maximum clade credibility (MCC) tree was summarized with TreeAnnotator v1.10 and visualized using the FigTree v1.4 program.
    TreeAnnotator
    suggested: (BEAST2, RRID:SCR_017307)
    FigTree
    suggested: (FigTree, RRID:SCR_008515)
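
The temporal-signal check quoted above (a regression of root-to-tip divergence against sampling time) can be illustrated with a minimal sketch. It assumes the root-to-tip distances have already been extracted from a rooted tree; it is not TempEst's implementation, and the function name `root_to_tip_regression` and the four data points are hypothetical, invented purely for illustration. Under a strict clock, the slope approximates the evolutionary rate and the x-intercept gives a rough estimate of the date of the root.

```python
from statistics import mean


def root_to_tip_regression(dates, divergences):
    """Ordinary least-squares fit of divergence = slope * date + intercept.

    dates: sampling times in decimal years.
    divergences: root-to-tip distances in substitutions per site.
    Returns (slope, intercept, r_squared, x_intercept).
    """
    if len(dates) != len(divergences) or len(dates) < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    mx, my = mean(dates), mean(divergences)
    sxx = sum((x - mx) ** 2 for x in dates)
    syy = sum((y - my) ** 2 for y in divergences)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dates, divergences))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")

    slope = sxy / sxx                      # rate, substitutions/site/year
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    # Date at which the fitted divergence is zero (rough root-age estimate).
    x_intercept = -intercept / slope if slope != 0 else None
    return slope, intercept, r_squared, x_intercept


# Hypothetical, made-up values chosen to lie on a perfect line.
dates = [2020.15, 2020.20, 2020.25, 2020.30]
divergences = [1.0e-4, 1.4e-4, 1.8e-4, 2.2e-4]

slope, intercept, r2, x_int = root_to_tip_regression(dates, divergences)
print(f"rate = {slope:.2e} subs/site/year, R^2 = {r2:.3f}, root date = {x_int:.3f}")
```

A positive slope with a reasonably high R² is usually read as evidence of enough temporal signal to proceed to molecular-clock analyses; with real, low-diversity data the fit is typically much noisier than in this toy example.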

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Although high-quality full genomes of SARS-CoV-2 currently available contain enough information to allow reliable phylogenetic inferences, the low genetic diversity of within-country (or regional) transmission clusters imposes a serious limitation for accurate phylogeographic reconstructions 39,40. Indeed, the MC test supports a random phylogenetic clustering of B.1.1.EU/BR and B.1.1.BR strains from most locations, with the exception of Brazil, Argentina and Europe (Supplementary Table 4). The B.1.1.BR sequences sampled in different Brazilian states were also highly similar or identical, making it difficult to trace with precision the origin and within-country fluxes of this viral clade during the early epidemic phase in Brazil. Another important limitation of our study is the uneven spatial and temporal sampling scheme. Most SARS-CoV-2 sequences recovered in the present study were from the Rio de Janeiro state and might thus not represent the viral diversity in other Brazilian states. More accurate reconstructions of the origin and regional spread of the clade B.1.1.BR will require a denser sampling from Brazil and neighboring South American countries, particularly during the very early phase of the epidemic. In summary, this study reveals the existence of a major SARS-CoV-2 B.1.1 lineage associated with community transmission in Brazil and widespread on a national scale. This major B.1.1 Brazilian lineage emerged in Brazil in February 2020, probably before the detection of the ...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.