Phylogenomics reveals multiple introductions and early spread of SARS-CoV-2 into Peru

Abstract

Peru has become one of the countries with the highest mortality rates from the current severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic. To investigate early transmission events and the genomic diversity of SARS-CoV-2 isolates circulating in Peru, we analyzed a total of 3472 SARS-CoV-2 genomes, of which 149 were from Peru. Phylogenomic analysis revealed multiple independent introductions of the virus, mainly from Europe and Asia. In addition, we found evidence for community-driven transmission of SARS-CoV-2, as suggested by clusters of related viruses found in patients living in different regions of Peru.

Article activity feed

  1. SciScore for 10.1101/2020.09.14.296814:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences and resources

    Sentence: Filtered reads were mapped against SARS‐CoV‐2 reference (NC_045512) using Burrows‐Wheeler Aligner MEM algorithm BWA‐MEM v0.7.7 (arXiv:1303.3997v2).
    Resource: BWA‐MEM
    Suggested: (Sniffles, RRID:SCR_017619)

    Sentence: SAMtools and Geneious Prime were used to sort BAM files, to generate alignment statistics and to obtain consensus sequence.
    Resource: SAMtools
    Suggested: (SAMTOOLS, RRID:SCR_002105)

    Sentence: Phylogenetic analysis of SARS-CoV in Peru: The full genomic dataset (n=3472) was aligned using MAFFT v 7.1 (10.1093/nar/gkf436) with default parameters.
    Resource: MAFFT
    Suggested: (MAFFT, RRID:SCR_011811)

    Sentence: We estimate a maximum likelihood tree of 3472 aligned sequences using IQ-tree v 1.6 (17) under a HKY nucleotide substitution model, with gamma distribution of among site rate variation (HKY+G+I) as selected by ModelFinder (18) and using the EPI_ISL_406801 sequence (GISAID) to root the tree.
    Resource: IQ-tree
    Suggested: (IQ-TREE, RRID:SCR_017254)

    Sentence: TempEst v1.5.1 (19) was used to assess the strength of temporal signal and inspect for outliers in the dataset by a root-to-tip regression of genetic distance against sampling date.
    Resource: TempEst
    Suggested: (TempEst, RRID:SCR_017304)
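
The root-to-tip regression mentioned in the last sentence above can be illustrated with a minimal sketch. This is not TempEst's implementation, and it is not the authors' code. It assumes that the root-to-tip genetic distances have already been read off a rooted tree and that sampling dates are given as decimal years. The function names `decimal_year` and `root_to_tip_regression` are hypothetical. Under these assumptions the slope approximates the substitution rate (substitutions per site per year), the x-intercept approximates the date of the root, R² indicates the strength of the temporal signal, and large residuals flag candidate outliers.

```python
from datetime import date


def decimal_year(d: date) -> float:
    """Convert a calendar date to a fractional year (2020-01-01 -> 2020.0)."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days


def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date.

    dates:     sampling dates as decimal years
    distances: root-to-tip genetic distances (substitutions per site)
    Returns (slope, intercept, r_squared, residuals).
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two paired observations")

    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))

    slope = sxy / sxx                      # estimated rate per year
    intercept = mean_y - slope * mean_x    # distance at year 0 (not a date)
    residuals = [y - (intercept + slope * x) for x, y in zip(dates, distances)]

    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in distances)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared, residuals
```

With a positive slope, the estimated root date is `-intercept / slope`. A sequence whose residual is far from zero has more or less divergence than its sampling date predicts, which is the kind of outlier the regression is used to inspect.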

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    One limitation of this study is the difference in the number of Peruvian isolates among the regions of Peru. Specifically, there were more SARS-CoV-2 sequences from Lima than from any other department of Peru. Thus, our estimates might be biased because they are based on the background sequences available at that time. Moreover, in the absence of epidemiological information such as travel history and contact tracing, it is hard to associate periods of untracked transmission with any specific region or country.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.