Genome Detective Coronavirus Typing Tool for rapid identification and characterization of novel coronavirus genomes

Abstract

Summary

Genome Detective is a web-based, user-friendly software application to quickly and accurately assemble all known virus genomes from next-generation sequencing datasets. This application allows the identification of phylogenetic clusters and genotypes from assembled genomes in FASTA format. Since its release in 2019, we have produced a number of typing tools for emergent viruses that have caused large outbreaks, such as Zika and Yellow Fever Virus in Brazil. Here, we present the Genome Detective Coronavirus Typing Tool, which can accurately identify the novel severe acute respiratory syndrome (SARS)-related coronavirus (SARS-CoV-2) sequences isolated in China and around the world. The tool accepts up to 2000 sequences per submission, and the analysis of a new whole-genome sequence takes approximately 1 min. The tool has been tested and validated with hundreds of whole genomes from 10 coronavirus species, and it correctly classified all of the SARS-related coronavirus (SARSr-CoV) sequences and all of the available public data for SARS-CoV-2. The tool also allows tracking of new viral mutations as the outbreak expands globally, which may help to accelerate the development of novel diagnostics, drugs and vaccines to stop COVID-19.
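
Because the tool takes FASTA input and caps a submission at 2000 sequences, a large collection may need to be counted and split before upload. The following is a minimal sketch of that preparation step, not part of Genome Detective itself; the function names `read_fasta` and `batches` are hypothetical, and only the 2000-sequence limit comes from the abstract.

```python
from typing import Iterable, Iterator, List, Tuple

# Limit per submission as stated in the abstract.
MAX_PER_SUBMISSION = 2000


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batches(records: Iterable[Tuple[str, str]],
            size: int = MAX_PER_SUBMISSION) -> List[List[Tuple[str, str]]]:
    """Split records into consecutive groups of at most `size` records."""
    records = list(records)
    return [records[i:i + size] for i in range(0, len(records), size)]
```

Each resulting batch can then be written back to its own FASTA file and submitted separately.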

Availability and implementation

https://www.genomedetective.com/app/typingtool/cov

Contact

koen@emweb.be or deoliveira@ukzn.ac.za

Supplementary information

Supplementary data are available at Bioinformatics online.

Article activity feed

  1. SciScore for 10.1101/2020.01.31.928796

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    A reference dataset of previously published coronavirus whole genome sequences (WGS) was compiled from the Virus Pathogen Resource (VIPR) database (www.viprbrc.org).
    WGS
    suggested: None
    VIPR
    suggested: (vipR, RRID:SCR_010685)
    The 431 reference WGS were aligned with MUSCLE (Edgar 2004).
    MUSCLE
    suggested: (MUSCLE, RRID:SCR_011812)
    A maximum-likelihood phylogenetic tree with 1000 bootstrap replicates was constructed in PhyML (Guindon & Gascuel 2003; Lemoine et al., 2018), and a Bayesian tree was constructed using MrBayes (Ronquist & Huelsenbeck 2003).
    PhyML
    suggested: (PhyML, RRID:SCR_014629)
    MrBayes
    suggested: (MrBayes, RRID:SCR_012067)
    The first classification analysis subjects a query sequence to BLAST and AGA analysis.
    BLAST
    suggested: (BLASTX, RRID:SCR_001653)
    AGA is a novel method for aligning nucleic acid sequences against annotated genomes from the NCBI RefSeq Virus Database. AGA (Deforche 2017) expands the optimal alignment algorithms of Smith-Waterman (Smith & Waterman 1981) and Gotoh (Gotoh 1982) based on an induction state with additional parameters.
    RefSeq
    suggested: (RefSeq, RRID:SCR_003496)
    In the second step, a query sequence is aligned against the phylogenetic reference dataset using the --add alignment option of the MAFFT software (Katoh & Standley 2013).
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
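
The sentences above describe AGA as an expansion of the Smith-Waterman and Gotoh algorithms. For orientation, the sketch below shows only that classic baseline: a local alignment score with affine gap penalties, computed with Gotoh's three-matrix recurrence. It is not AGA and does not use genome annotations; the function name `local_affine_score` and all scoring parameters are illustrative assumptions.

```python
def local_affine_score(a: str, b: str, match: int = 2, mismatch: int = -1,
                       gap_open: int = 3, gap_extend: int = 1) -> int:
    """Best local alignment score of a and b with affine gap costs.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    Scoring values are assumptions chosen for illustration only.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j), floored at 0 (local).
    # E: best score ending with a gap in a; F: ending with a gap in b.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either extend an existing gap or open a new one.
            E[i][j] = max(E[i][j - 1] - gap_extend, H[i][j - 1] - gap_open)
            F[i][j] = max(F[i - 1][j] - gap_extend, H[i - 1][j] - gap_open)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

The separate E and F matrices are what make long gaps cheaper per position than repeated gap openings, which is the contribution of Gotoh's formulation over a linear gap penalty.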

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • No conflict of interest statement was detected. If there are no conflicts, we encourage authors to explicitly state so.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.
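
To illustrate what a presence check for RRIDs can look like, the sketch below scans text for identifiers written in the `RRID:SCR_010685` style that appears in the table above. This is an assumption-laden illustration, not SciScore's method: the function name `find_rrids` and the regular expression are hypothetical, and the check is purely syntactic, so it cannot tell whether an identifier actually resolves to the named resource.

```python
import re
from typing import List

# Assumed pattern: "RRID:" followed by a registry prefix, an underscore,
# and an alphanumeric accession (e.g. SCR_010685).
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+_[A-Za-z0-9]+)")


def find_rrids(text: str) -> List[str]:
    """Return every RRID-like identifier found in text, in order."""
    return RRID_PATTERN.findall(text)
```

For example, `find_rrids("suggested: (MUSCLE, RRID:SCR_011812)")` returns `["SCR_011812"]`, while a sentence with no identifier yields an empty list.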