Genomic landscape of the SARS-CoV-2 pandemic in Brazil suggests an external P.1 variant origin

Abstract

Brazil was the epicenter of the worldwide pandemic at the peak of its second wave. A genomic/proteomic perspective on the COVID-19 pandemic in Brazil could provide insights into the behavior of the global pandemic. In this study, we track SARS-CoV-2 molecular information in Brazil using real-time bioinformatics and data science strategies to provide a comparative and evolutionary panorama of the lineages in the country. SWeeP vectors represented the Brazilian and worldwide genomic/proteomic data from the Global Initiative on Sharing Avian Influenza Data (GISAID) between February 2020 and August 2021. Clusters were analyzed and compared with PANGO lineages. Hierarchical clustering provided phylogenetic and evolutionary analyses of the lineages, and we tracked the origin of the P.1 (Gamma) variant. Genomic diversity based on Chao's estimation allowed us to compare richness and coverage among Brazilian states and other representative countries. We found that the epidemic in Brazil occurred in two periods with different genetic profiles. The P.1 lineages emerged in the second wave, which was more aggressive. We could not trace the origin of P.1 to the variants present in Brazil. Instead, we found evidence pointing to an external source and a possible recombination event that may relate P.1 to a subset of B.1.1.28 variants. We discuss the potential application of the pipeline to the detection of emerging variants and the stability of PANGO terminology over time. The diversity analysis showed that low coverage and unbalanced sequencing among states in Brazil could have allowed the silent entry and dissemination of P.1 and other dangerous variants. This study may help to understand the development and consequences of the entry of variants of concern (VOCs).
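
The abstract does not say which of Chao's estimators was used. As a rough illustration only, assuming the bias-corrected Chao1 richness estimator and Good's sample coverage, the two quantities could be computed from per-lineage sequence counts as sketched below. The function names and the example counts are hypothetical and are not taken from the study.

```python
def chao1(lineage_counts):
    """Bias-corrected Chao1 richness estimate.

    lineage_counts maps a lineage name to the number of sequences
    assigned to it (assumed input format, not the study's).
    """
    counts = [c for c in lineage_counts.values() if c > 0]
    observed = len(counts)                    # lineages seen at least once
    f1 = sum(1 for c in counts if c == 1)     # lineages seen exactly once
    f2 = sum(1 for c in counts if c == 2)     # lineages seen exactly twice
    # Bias-corrected form; stays defined when there are no doubletons.
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


def good_coverage(lineage_counts):
    """Good's estimate of sample coverage: 1 - (singletons / total sequences)."""
    counts = [c for c in lineage_counts.values() if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("no sequences to estimate coverage from")
    f1 = sum(1 for c in counts if c == 1)
    return 1 - f1 / total


# Made-up counts for one hypothetical state, for illustration only.
example = {"P.1": 10, "B.1.1.28": 2, "B.1.1.33": 1, "P.2": 1, "B.1": 2}
richness = chao1(example)
coverage = good_coverage(example)
```

Under these assumptions, a state with many singleton lineages gets a richness estimate well above its observed lineage count and a low coverage value, which is the kind of under-sampling signal the diversity comparison relies on.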

Article activity feed

  1. SciScore for 10.1101/2021.11.10.21266084:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: 2.3.8 2021-04-20); b) principal analysis (release 609, PANGO v3.0.5 2021-06-04); and c) final update (release 829, PANGO v.
    Resource: PANGO
    Suggested: None

    Sentence: The R version of SWeeP tool, used for the proteome vectorization, is available in the Bioconductor Platform for R version 3.12 (25).
    Resource: Bioconductor
    Suggested: (Bioconductor, RRID:SCR_006442)

    Sentence: The same orthonormal base, with the SWeeP default parameters (length 600 and mask [1 1 0 1 1]), was employed to project all sequences into compacted vectors.
    Resource: SWeeP
    Suggested: (SWEEP, RRID:SCR_009418)

    Sentence: Cluster analysis and visualization: Brazilian proteomes were clustered using the ConsensusClusterPlus package version 1.54.0 from Bioconductor (26) and the kmedoids method (Partitioning Around Medoids, PAM), in procedures with 1,000 replicates for each cycle, testing 2-20 as the number of clusters.
    Resource: ConsensusClusterPlus
    Suggested: (ConsensusClusterPlus, RRID:SCR_016954)

    Sentence: The t-SNE diagrams were constructed in the Rtsne package, with its default parameters.
    Resource: Rtsne
    Suggested: (Rtsne, RRID:SCR_016342)
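
As a rough illustration of what the mask [1 1 0 1 1] in the SWeeP sentence above selects, the sketch below slides a five-residue window along a protein sequence and keeps only the residues at the positions where the mask is 1. It is not the SWeeP implementation: the function name is hypothetical, the counting scheme is an assumption, and the projection onto the orthonormal base of length 600 is not shown.

```python
from collections import Counter


def spaced_words(sequence, mask=(1, 1, 0, 1, 1)):
    """Count masked subwords of a sequence.

    For every window of len(mask) residues, keep the residues at
    positions where the mask is 1 and skip those where it is 0.
    """
    keep = [i for i, bit in enumerate(mask) if bit]
    width = len(mask)
    counts = Counter()
    # Sequences shorter than the mask yield an empty range and no words.
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        counts["".join(window[i] for i in keep)] += 1
    return counts


# Toy fragment, for illustration only: two windows, MFVFL and FVFLV.
words = spaced_words("MFVFLV")
```

With this mask, each window contributes a four-residue word that ignores its middle residue, so two windows that differ only at the masked position map to the same word.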

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.