The emergence of inter-clade hybrid SARS-CoV-2 lineages revealed by 2D nucleotide variation mapping

Abstract

I performed whole-genome sequencing on SARS-CoV-2 from COVID-19 samples collected at Mayo Clinic Rochester in mid-April 2020, generated 85 consensus genome sequences, and compared them with other genome sequences collected worldwide. I proposed a novel visualization method that uses a 2D map to display populations of co-occurring nucleotide variants within and between viral clades. This method is well suited to the new era of “big data”, in which high-throughput sequencing is becoming readily available. Using this method, I revealed the emergence of inter-clade hybrid SARS-CoV-2 lineages that are potentially caused by homologous genetic recombination.
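One way to picture the 2D map described in the abstract is as a co-occurrence table of variant positions. The sketch below is not the author's code; it assumes each genome has already been reduced to a set of variant positions relative to the reference, and the function name `cooccurrence_counts` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(genomes):
    """Count how many genomes carry each pair of variant positions.

    `genomes` is an iterable of sets of variant positions (1-based,
    relative to the reference). The result is keyed by (pos_a, pos_b)
    with pos_a <= pos_b; the diagonal (pos, pos) holds the number of
    genomes carrying that single variant.
    """
    counts = Counter()
    for variants in genomes:
        ordered = sorted(variants)
        for pos in ordered:
            counts[(pos, pos)] += 1
        for a, b in combinations(ordered, 2):
            counts[(a, b)] += 1
    return counts


# Toy data for illustration only; these are not calls from the 85 samples.
toy_genomes = [
    {241, 3037, 14408, 23403},
    {241, 3037, 14408, 23403, 25563},
    {8782, 28144},
    {241, 8782, 23403},  # carries variants seen in both groups above
]

counts = cooccurrence_counts(toy_genomes)

# Print the off-diagonal pairs that occur in more than one genome.
for (a, b), n in sorted(counts.items()):
    if a != b and n > 1:
        print(a, b, n)
```

Plotting these counts on a position-by-position grid would give one possible version of such a map; a genome whose variants fall in blocks associated with two different clades would stand out as a candidate hybrid.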

Article activity feed

  1. SciScore for 10.1101/2020.10.13.338038:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms

    Sentence: Libraries were sequenced on ONT’s MinION device using Type R9.4.1 flow cells (FLO-MIN106D).
    Resource: MinION
    Suggested: (MinION, RRID:SCR_017985)

    Sentence: The evolutionary history of SARS-CoV-2 was inferred using a Bayesian approach, implemented through the Markov chain Monte Carlo framework available in BEAST 1.10.4 31, utilizing the BEAGLE library v3 32 to increase computational performance.
    Resource: BEAGLE
    Suggested: (BEAGLE, RRID:SCR_001789)

    Sentence: The phylogenetic tree was then plotted using the FigTree (v1.4.4) accompany with the BEAST package.
    Resource: FigTree
    Suggested: (FigTree, RRID:SCR_008515)
    Resource: BEAST
    Suggested: (BEAST, RRID:SCR_010228)

    Sentence: Each of the remaining 17,500 genome sequences was individually aligned to the reference SARS-CoV-2 genome (NC_045512.2) using the NEEDLE tool (Needleman-Wunsch global alignment of two sequences) from the EMBOSS package v6.6.0.0 33.
    Resource: EMBOSS
    Suggested: (EMBOSS, RRID:SCR_008493)

    Sentence: Custom scripts in Perl, Python, and R were used for various calculations and graphics.
    Resource: Python
    Suggested: (IPython, RRID:SCR_001658)
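The alignment step quoted in the table above relies on Needleman-Wunsch global alignment. As a rough illustration of that technique only, here is a minimal pure-Python sketch. It is not the EMBOSS implementation: it uses a simple linear gap penalty, the scoring values are arbitrary assumptions, and the helper `substitutions` is a hypothetical addition for listing mismatches against the reference.

```python
def needleman_wunsch(ref, query, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; return (score, aligned_ref, aligned_query)."""
    n, m = len(ref), len(query)
    # score[i][j] is the best score for aligning ref[:i] with query[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == query[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two bases
                score[i - 1][j] + gap,      # gap in the query
                score[i][j - 1] + gap,      # gap in the reference
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_ref, aligned_query = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and ref[i - 1] == query[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            aligned_ref.append(ref[i - 1])
            aligned_query.append(query[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_ref.append(ref[i - 1])
            aligned_query.append("-")
            i -= 1
        else:
            aligned_ref.append("-")
            aligned_query.append(query[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(aligned_ref)), "".join(reversed(aligned_query))


def substitutions(aligned_ref, aligned_query):
    """List (reference_position, ref_base, query_base) for mismatched columns."""
    found = []
    pos = 0  # 1-based position in the ungapped reference
    for r, q in zip(aligned_ref, aligned_query):
        if r != "-":
            pos += 1
            if q != "-" and r != q:
                found.append((pos, r, q))
    return found


# Toy sequences, far shorter than a real genome.
score, a_ref, a_query = needleman_wunsch("GATTACA", "GATCACA")
variants = substitutions(a_ref, a_query)
```

The dynamic-programming table grows with the product of the two sequence lengths, so a pure-Python version like this is only practical for short toy inputs; for genome-length sequences an optimized tool is the realistic choice.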

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Limitations of phylogenetic analysis: A phylogenetic tree is a diagram used to illustrate phylogeny, the history of evolution, in reference to lines of descent and relationships among broad groups of organisms. The branching patterns in a phylogenetic tree reflect how species or other groups evolved from a common ancestor. The methodology has been used increasingly in diverse areas of biological science, especially for tracing viral evolution at the early stages of the outbreak in an epidemic like COVID-19 3,21,23,24. However, there are inherent limitations to a phylogenetic tree. First, traditional methods of phylogeny estimation, such as maximum parsimony, minimum evolution, or maximum likelihood, all assume that a single evolutionary history underlies the sequences. Second, the viral network presented in the tree diagram is merely a snapshot of the early stage of the viral spreading before the phylogeny becomes obscured by subsequent migration and mutation. When the number of species in the group increases, drawing a tree diagram becomes impractical. Third, commonly used methods in phylogeny, such as the pairwise distance method, do not capture the effects of homologous recombination that imply how different parts of the sequence could have separate phylogenetic histories and are not related by a single phylogenetic tree 25. Though the ability to detect recombination is limited, ignoring recombination in tree-based analysis could lead to artifacts 26,27. 3. Homologous recomb...
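The third limitation quoted above, that different parts of a sequence can have separate histories, can be illustrated with a toy example. The sketch below is an assumption-laden illustration rather than anything taken from the preprint: the sequences are invented, and `nearest_parent_by_window` is a hypothetical helper that compares a query with candidate parents window by window using Hamming distance.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def nearest_parent_by_window(query, parents, window):
    """Label each non-overlapping window with the closest parent, or 'tie'."""
    labels = []
    for start in range(0, len(query), window):
        segment = query[start:start + window]
        dists = {
            name: hamming(segment, seq[start:start + window])
            for name, seq in parents.items()
        }
        best = min(dists.values())
        winners = [name for name, d in dists.items() if d == best]
        labels.append(winners[0] if len(winners) == 1 else "tie")
    return labels


# Invented 16-base sequences; the hybrid joins the first half of A to the second half of B.
parent_a = "AAAACCCCGGGGTTTT"
parent_b = "AATACCGCGGAGTTCT"
hybrid = parent_a[:8] + parent_b[8:]
parents = {"A": parent_a, "B": parent_b}

whole_genome = {name: hamming(hybrid, seq) for name, seq in parents.items()}
per_window = nearest_parent_by_window(hybrid, parents, window=8)
```

In this toy case the whole-sequence distances to the two parents are equal, while the per-window labels differ between the two halves, which is the kind of signal a single tree built from whole-sequence distances cannot represent.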

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.