Genomic epidemiology of the SARS-CoV-2 epidemic in Brazil

Abstract

The high numbers of COVID-19 cases and deaths in Brazil have made Latin America an epicentre of the pandemic. SARS-CoV-2 established sustained transmission in Brazil early in the pandemic, but important gaps remain in our understanding of virus transmission dynamics at a national scale. We use 17,135 near-complete genomes sampled from 27 Brazilian states and neighbouring Paraguay. From March to November 2020, we detected co-circulation of multiple viral lineages that were linked to multiple importations (predominantly from Europe). After November 2020, we detected large, local transmission clusters within the country. In the absence of effective restriction measures, the epidemic progressed, and in January 2021 variants of concern and variants under monitoring, including Gamma (P.1) and Zeta (P.2), emerged and spread onward both within Brazil and abroad. We also provide a genomic overview of the epidemic in Paraguay and detected evidence of importation of SARS-CoV-2 ancestor lineages and variants of concern from Brazil. Our findings show that genomic surveillance in Brazil enabled assessment of the real-time spread of emerging SARS-CoV-2 variants.

Article activity feed

  1. SciScore for 10.1101/2021.10.07.21264644:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Viral RNA was extracted from nasopharyngeal swabs using an automated protocol and tested for SARS-CoV-2 by multiplex real-time PCR assays: (i) the Allplex 2019-nCoV Assay (Seegene) targeting the envelope (E), the RNA dependent RNA polymerase (RdRp) and the nucleocapsid (N) genes; (ii) the Charité: SARS-CoV2 (E/RP) assay (Bio-Manguinhos/Fiocruz) targeting the E gene, and (iii) the GeneFinder COVID-19 Plus RealAmp Kit (Osang Healthcare, South Korea) supplied by the BrMoH, Butantan Institute and the Pan-American Health Organization (OPAS)
    GeneFinder
    suggested: (GENEFINDER, RRID:SCR_009190)
    Generation of consensus sequences from Illumina and nanopore: The genome assembly pipeline for Illumina reads involved: (i) read trimming and filtering using Trimmomatic (Bolger et al. 2014); (ii) minimap2 (Li 2018) for read mapping against the reference strain (Wuhan-hu-1 genome reference - NCBI accession NC_045512.2); (iii) samtools (Danecek et al. 2021) for sorting and indexing; (iv) Pilon (Walker et al. 2014) for improving the indel detection; (v) bwa mem (Li 2013) for remapping against Pilon’s generated consensus; (vi) samtools mpileup to generate alignment quality values; (vii) seqtk (Li, 2018) to generate a quasi-final genome version; (viii) bwa mem for a 3rd round of remapping reads against the quasi-final genome; and (ix) samtools depth to assess position depths given the *.bam file from the previous step (nucleotide positions with read depth < 5 are denoted as “N”).
    Trimmomatic
    suggested: (Trimmomatic, RRID:SCR_011848)
    Pilon
    suggested: (Pilon, RRID:SCR_014731)
    samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    Consensus sequences were generated by de novo assembly using Genome Detective [32], which uses DIAMOND to identify and classify candidate viral reads in broad taxonomic units, using the viral subset of the Swissprot UniRef protein database.
    DIAMOND
    suggested: (DIAMOND, RRID:SCR_009457)
    Candidate reads were next assigned to candidate reference sequences using NCBI blastn and aligned using AGA (Annotated Genome Aligner) and MAFFT.
    blastn
    suggested: (BLASTN, RRID:SCR_001598)
    Phylogenetic analysis: Sequences were aligned using MAFFT [33] and submitted to IQ-TREE2 for maximum likelihood (ML) phylogenetic analysis [35] employing the general time reversible (GTR) model of nucleotide substitution and a proportion of invariable sites (+I) as selected by the ModelFinder application.
    ModelFinder
    suggested: None
    Briefly, sequences from the subsampled cluster were aligned using MAFFT and preliminary ML trees were inferred in IQ-TREE2 as described above.
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    Prior to phylogeographic analysis, each lineage was also assessed for molecular clock signal using the root-to-tip regression method available in TempEst v1.5.3 [34] following the removal of potential outliers that may violate the molecular clock assumption.
    TempEst
    suggested: (TempEst, RRID:SCR_017304)
    MCMC analyses were set up in BEAST v1.10.4, running in duplicate for 100 million iterations and sampling every 10,000 steps in the chain.
    BEAST
    suggested: (BEAST, RRID:SCR_010228)
    Convergence for each run was assessed in Tracer v1.7.1 (ESS for all relevant model parameters >200).
    Tracer
    suggested: (Tracer, RRID:SCR_019121)
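
The depth-masking rule quoted in the table above (nucleotide positions with read depth < 5 are denoted as "N") can be illustrated with a minimal Python sketch. It is not the authors' script, which is not shown here: the function name `mask_low_depth`, the toy input, and the choice to treat positions missing from the depth table as depth 0 are assumptions made for illustration. The input follows the three-column, tab-separated layout (reference name, 1-based position, depth) that `samtools depth` prints for a single BAM file.

```python
def mask_low_depth(consensus, depth_lines, min_depth=5):
    """Replace bases whose read depth is below min_depth with 'N'.

    consensus   : consensus sequence as a string
    depth_lines : iterable of 'ref<TAB>pos<TAB>depth' lines (1-based positions)
    min_depth   : threshold; 5 matches the rule quoted in the report

    Assumption (not from the source): positions absent from depth_lines
    are treated as depth 0 and therefore masked.
    """
    depth = [0] * len(consensus)
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _ref, pos, d = line.split("\t")[:3]
        idx = int(pos) - 1  # convert 1-based position to 0-based index
        if 0 <= idx < len(consensus):
            depth[idx] = int(d)
    # Keep a base only when its depth reaches the threshold.
    return "".join(
        base if d >= min_depth else "N"
        for base, d in zip(consensus, depth)
    )


if __name__ == "__main__":
    # Hypothetical toy depth table; position 4 is deliberately missing.
    toy_depth = [
        "NC_045512.2\t1\t12",
        "NC_045512.2\t2\t4",
        "NC_045512.2\t3\t5",
        "NC_045512.2\t5\t30",
    ]
    print(mask_low_depth("ACGTA", toy_depth))
```

In this toy input, position 2 (depth 4) and position 4 (no depth record) are masked, while position 3 is kept because a depth of exactly 5 is not below the threshold.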

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    To help overcome these limitations we report genomic data obtained by sequencing 3,866 SARS-CoV-2 infection cases confirmed by RT-qPCR from patients residing in 8 of the 27 Brazilian federal states and one neighbouring country - Paraguay. These were analysed together with n=13,328 and n=102 (up to 30, June 20th, 2021) publicly available complete genome sequences from both countries, respectively. By combining epidemiological and genomic data, we show how the interplay between the implementation of restriction measures and sustained SARS-CoV-2 transmission has shaped the Brazilian epidemic over 20 months, including the dramatic resurgences in case numbers linked with the emergence of VOCs and VUMs. In particular, we show that multiple independent importations of SARS-CoV-2, predominantly from Europe, had occurred in Brazil during the early phase of the epidemic (up to April 2020). We further detected multiple (n=33) international introductions which likely occurred during enforcement of preventive measures, highlighting their inefficacy. We also revealed that during 2020 Brazil transitioned from a viral importer to a viral exporter, resulting in 10 times more inferred exportation events from Brazil than viral introductions into Brazil (Fig. 2E). This was likely linked to the identification of Brazilian VOC and VUM, the spread of which (both within Brazil and to other countries) likely followed patterns of population density and mobility [12]. We also provide the first...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.