Predominance of the SARS-CoV-2 Lineage P.1 and Its Sublineage P.1.2 in Patients from the Metropolitan Region of Porto Alegre, Southern Brazil in March 2021

Abstract

Almost a year after the COVID-19 pandemic had begun, new lineages (B.1.1.7, B.1.351, P.1, and B.1.617.2) associated with enhanced transmissibility, immunity evasion, and mortality were identified in the United Kingdom, South Africa, Brazil, and India. The previously most prevalent lineages in the state of Rio Grande do Sul (RS, Southern Brazil), B.1.1.28 and B.1.1.33, were rapidly replaced by P.1 and P.2, two B.1.1.28-derived lineages harboring the E484K mutation. To perform a genomic characterization of the metropolitan region of Porto Alegre, we sequenced viral samples to: (i) identify the prevalence of SARS-CoV-2 lineages in the region, the state, and bordering countries/regions; (ii) characterize the mutation spectra; (iii) hypothesize viral dispersal routes by using phylogenetic and phylogeographic approaches. We found that 96.4% of the samples belonged to the P.1 lineage and approximately 20% of them were assigned to the novel P.1.2, a P.1-derived sublineage harboring signature substitutions recently described in other Brazilian states and foreign countries. Moreover, sequences from this study were allocated in distinct branches of the P.1 phylogeny, suggesting multiple introductions in RS and placing this state as a potential diffusion core of P.1-derived clades and the emergence of P.1.2. It is uncertain whether the emergence of P.1.2 and other P.1 clades is related to clinical or epidemiological consequences. However, the clear signs of molecular diversity from the recently introduced P.1 warrant further genomic surveillance.

Article activity feed

  1. SciScore for 10.1101/2021.05.18.21257420:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics (Consent; IRB): Ethics approval and consent to participate: Ethical approval was obtained from the Brazilian National Ethics Committee (Comissão Nacional de Ética em Pesquisa, CONEP) under process number CAAE 41909121.0.0000.5553 and from the Comitê de Ética em Pesquisa em Seres Humanos da Universidade Federal de Ciências da Saúde de Porto Alegre (CEP - UFCSPA) under process number CAAE 35083220.2.0000.5345.
    Sex as a biological variable: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Reads were mapped to the reference SARS-CoV-2 genome (GenBank accession number NC_045512.2) using Bowtie v2.4.2 (end-to-end and very-sensitive parameters) (26).
    Bowtie
    suggested: (Bowtie, RRID:SCR_005476)
    Mapping coverage and depth were retrieved using samtools v1.11 (27) (minimum base quality per base (Q) ≥ 30).
    samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    Mutation analysis: Single Nucleotide Polymorphisms (SNPs) and insertions/deletions in each sample were identified using snippy variant calling and core genome alignment pipeline v4.6.0 (https://github.com/tseemann/snippy), which uses FreeBayes v1.3.2 (30) to call variants and snpEff v5.0 (31) to annotate and predict their effects on genes and proteins.
    FreeBayes
    suggested: (FreeBayes, RRID:SCR_010761)
    snpEff
    suggested: (SnpEff, RRID:SCR_005191)
    These sequences were aligned using MAFFT v7.475, the ends of the alignment (300 in the beginning and 500 in the end) were masked, and the ML tree was built with IQ-TREE v2.0.3 using the GTR+F+R3 nucleotide substitution model as selected by the ModelFinder (40).
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    IQ-TREE
    suggested: (IQ-TREE, RRID:SCR_017254)
    ML trees were inspected in TempEst v1.5.3 (42) to investigate the temporal signal through regression of root-to-tip genetic divergence against sampling dates.
    TempEst
    suggested: (TempEst, RRID:SCR_017304)
    ML and time-resolved trees were visualized using FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/) and ggtree R package v2.0.4 (43).
    FigTree
    suggested: (FigTree, RRID:SCR_008515)
    Evolutionary parameters and spatial diffusion were estimated separately for each clade using a Bayesian Markov Chain Monte Carlo (MCMC) approach implemented in BEAST v1.10.4 (45) using the BEAGLE library (46) to reduce computation time.
    BEAST
    suggested: (BEAST, RRID:SCR_010228)
    BEAGLE
    suggested: (BEAGLE, RRID:SCR_001789)
    The MCMC chains were run in duplicates for at least 50 million generations, and convergence was checked using Tracer v1.7.1 (50).
    Tracer
    suggested: (Tracer, RRID:SCR_019121)
    Log and tree files were combined using LogCombiner v1.10.4 to ensure stationarity and good mixing (43) after removing 10% as burn-in.
    LogCombiner
    suggested: (BEAST2, RRID:SCR_017307)
    Geoplotting: Geographical maps and other plots were generated using R v3.6.1 (52), and the ggplot2 v3.3.2 (53), geobr v.1.4 (54), and sf v0.9.8 (55) packages.
    ggplot2
    suggested: (ggplot2, RRID:SCR_014601)
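
Three of the processing steps quoted in Table 2 (masking the alignment ends, regressing root-to-tip divergence against sampling date, and discarding 10% burn-in before combining chains) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which relied on MAFFT, TempEst, and LogCombiner; all function names below are hypothetical, and the regression is a plain least-squares fit that omits the root optimisation TempEst can perform.

```python
from datetime import date


def mask_alignment_ends(seq, n_start=300, n_end=500, mask_char="N"):
    """Replace the first n_start and last n_end alignment columns with mask_char."""
    length = len(seq)
    if n_start + n_end >= length:
        # The sequence is shorter than the masked region: mask everything.
        return mask_char * length
    return mask_char * n_start + seq[n_start:length - n_end] + mask_char * n_end


def decimal_year(d):
    """Convert a calendar date to a decimal year."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days


def root_to_tip_regression(dates, divergences):
    """Least-squares fit of root-to-tip divergence on sampling date.

    Returns (slope, intercept, r_squared). Needs at least two distinct dates.
    The slope is only a rough rate estimate; r_squared indicates temporal signal.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, divergences))
    syy = sum((y - mean_y) ** 2 for y in divergences)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


def drop_burn_in(samples, fraction=0.10):
    """Discard the leading fraction of an MCMC chain as burn-in."""
    n_drop = int(len(samples) * fraction)
    return samples[n_drop:]


def combine_chains(chains, fraction=0.10):
    """Remove burn-in from each chain, then concatenate the remainders."""
    combined = []
    for chain in chains:
        combined.extend(drop_burn_in(chain, fraction))
    return combined
```

Burn-in is removed per chain before concatenation, so early, unconverged samples from each replicate run are excluded from the combined posterior sample.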

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Some limitations should be considered. Firstly, the sample size is low and not necessarily representative of the RS state. Furthermore, publicly available genomes are a result of episodic sequencing efforts, especially in Brazil. This scenario restricts more precise inferences about introductions and diffusion processes in regional and worldwide contexts since samples are not geographically and temporally well distributed. Therefore, more research and surveillance are essential to achieve a more precise genomic characterization of SARS-CoV-2 in Brazil, identifying novel variants promptly to better respond to and control their spread. In summary, our study corroborates the virtually complete replacement of previous lineages by P.1 in Southern Brazil in COVID-19 cases sequenced in March 2021. Moreover, we confirmed various cases caused by the novel P.1.2 sublineage and placed its origin in the State of Rio Grande do Sul. The continuous evolution of the VOC P.1 is worrisome, considering its clinical and epidemiological impact, and warrants enhanced genomic surveillance.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.