Genetic grouping of SARS-CoV-2 coronavirus sequences using informative subtype markers for pandemic spread visualization

Article activity feed

  1. SciScore for 10.1101/2020.04.07.030759:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: We then performed multiple sequence alignment on all remaining sequences using MAFFT [38] with the “FFT-NS-2” method in XSEDE [39].
    Resource: MAFFT (suggested: MAFFT, RRID:SCR_011811)

    Sentence: All visualizations in this paper and our pipeline are generated using Matplotlib and Plotly [41, 42].
    Resources: Matplotlib (suggested: MatPlotLib, RRID:SCR_008624); Plotly (suggested: Plotly, RRID:SCR_013991)

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    All inferences drawn from observed temporal trends in subtypes based on the genome sequence dataset, whether based on ISM or phylogeny-based methods, will be limited by important caveats, including:
    1) The collection date of the viral sequence is usually later than the date that the individual was actually infected by the virus. Many of those individuals will be tested after they develop symptoms, which may only begin to arise several days or even two weeks after infection according to current estimates [60].
    2) The depth of sequencing within different regions is highly variable. As an extreme case, Iceland, which has a small population, has 1.3% of all sequences in the complete data set. Italy, on the other hand, had a large and early outbreak but has disproportionately less sequencing coverage (133 sequences).

    Evaluating the ability of ISM-defined subtypes to track significant genetic changes during the SARS-CoV-2 pandemic: In our results section, we identified a few widespread ISM subtypes, e.g., TCCGCCAGTGG that dominates New York and some ISM subtypes that are unique to a region, e.g., CCTGCTAAGGG that is mostly found in North America. In this section, we show related literature and how their results relate to ours. We primarily use the original 20-nt ISM identifiers in this section, rather than the compressed ISM, in order to discuss all the positions identified by our entropy analysis and relate them to the literature. Subtype prevalent in New York and some European coun...
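
    For readers unfamiliar with the "entropy analysis" mentioned in the excerpt above, the general idea of picking informative alignment positions can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the toy alignment, the 0.5-bit threshold, and the choice to ignore gaps and ambiguous bases are all assumptions made for the example.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    Assumption: only unambiguous bases A/C/G/T are counted; gaps and
    ambiguity codes are ignored.
    """
    counts = Counter(base for base in column if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def informative_positions(aligned, threshold):
    """Return 0-based column indices whose entropy exceeds the threshold."""
    length = len(aligned[0])
    return [
        i
        for i in range(length)
        if column_entropy([seq[i] for seq in aligned]) > threshold
    ]


def marker_string(seq, positions):
    """Concatenate the bases at the selected positions into a short label."""
    return "".join(seq[i] for i in positions)


# Toy alignment (hypothetical data, equal-length sequences).
aligned = ["ACGTAC", "ACGTAC", "ATGTAA", "ATGTAA"]
positions = informative_positions(aligned, threshold=0.5)  # hypothetical cutoff
labels = [marker_string(seq, positions) for seq in aligned]
```

    In this toy case only the two variable columns are selected, so each sequence is summarized by a two-character label. Real alignments would additionally need a policy for gaps, ambiguous bases, and low-coverage regions.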

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.