Phylogenomics reveals multiple introductions and early spread of SARS‐CoV‐2 into Peru

Abstract

Peru has become one of the countries with the highest mortality rates from the current coronavirus disease 2019 (COVID‐19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2). To investigate early transmission events and the genomic diversity of SARS‐CoV‐2 isolates circulating in Peru in the early COVID‐19 pandemic, we analyzed 3472 viral genomes, of which 149 were from Peru. Phylogenomic analysis revealed multiple and independent introductions of the virus likely from Europe and Asia and a high diversity of genetic lineages circulating in Peru. In addition, we found evidence for community‐driven transmission of SARS‐CoV‐2 as suggested by clusters of related viruses found in patients living in different regions of Peru.

Article activity feed

  1. SciScore for 10.1101/2020.09.14.296814:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Filtered reads were mapped against SARS‐CoV‐2 reference (NC_045512) using Burrows‐Wheeler Aligner MEM algorithm BWA‐MEM v0.7.7 (arXiv:1303.3997v2).
    BWA‐MEM
    suggested: (Sniffles, RRID:SCR_017619)
    SAMtools and Geneious Prime were used to sort BAM files, to generate alignment statistics and to obtain consensus sequence.
    SAMtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    Phylogenetic analysis of SARS-CoV in Peru: The full genomic dataset (n=3472) was aligned using MAFFT v 7.1 (10.1093/nar/gkf436) with default parameters.
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    We estimate a maximum likelihood tree of 3472 aligned sequences using IQ-tree v 1.6 (17) under a HKY nucleotide substitution model, with gamma distribution of among site rate variation (HKY+G+I) as selected by ModelFinder (18) and using the EPI_ISL_406801 sequence (GISAID) to root the tree.
    IQ-tree
    suggested: (IQ-TREE, RRID:SCR_017254)
    TempEst v1.5.1 (19) was used to assess the strength of temporal signal and inspect for outliers in the dataset by a root-to-tip regression of genetic distance against sampling date.
    TempEst
    suggested: (TempEst, RRID:SCR_017304)
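
The last entry above describes a root-to-tip regression of genetic distance against sampling date. As a rough illustration of that idea only (not the authors' code and not TempEst itself), the sketch below fits an ordinary least-squares line to root-to-tip distances. It assumes the tree is already rooted and that the distances have been extracted beforehand. The function names, the dictionary keys, and the example numbers are all hypothetical.

```python
from datetime import date


def decimal_year(d: date) -> float:
    """Convert a calendar date to a fractional year."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days


def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance (y) on sampling date (x).

    dates: sampling dates as fractional years.
    distances: root-to-tip distances in substitutions per site.
    Returns a dict (keys are this sketch's own naming) with the slope
    (a crude rate estimate), the x-intercept (a crude root date), R^2
    (strength of temporal signal), and per-sample residuals, where large
    absolute residuals flag candidate outliers.
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two samples and equal-length inputs")

    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    residuals = [y - (intercept + slope * x) for x, y in zip(dates, distances)]
    # The x-intercept is only meaningful when the slope is positive.
    tmrca = -intercept / slope if slope > 0 else None

    return {
        "rate": slope,
        "tmrca": tmrca,
        "r_squared": r_squared,
        "residuals": residuals,
    }


if __name__ == "__main__":
    # Made-up example values, not data from the study.
    sampled = [date(2020, 3, 5), date(2020, 3, 20), date(2020, 4, 10), date(2020, 5, 1)]
    dists = [1.0e-4, 1.6e-4, 2.1e-4, 2.9e-4]
    fit = root_to_tip_regression([decimal_year(d) for d in sampled], dists)
    print(fit["rate"], fit["tmrca"], fit["r_squared"])
```

A low R^2 or a non-positive slope would suggest weak temporal signal. TempEst additionally offers interactive re-rooting to find the best-fitting root, which this sketch does not attempt.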

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    One limitation of this study is the uneven number of Peruvian isolates across the regions of Peru. Specifically, there were more SARS-CoV-2 sequences from Lima than from any other Peruvian department. Our estimates might therefore be biased, because they are based on the background sequences available at that time. Moreover, in the absence of epidemiological information such as travel history and contact tracing, it is hard to associate periods of untracked transmission with any specific region or country.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.