Sequence analysis of SARS-CoV-2 genome reveals features important for vaccine design

This article has been Reviewed by the following groups


Abstract

As the SARS-CoV-2 pandemic progresses rapidly, an effective vaccine is critically needed. A promising approach to vaccine development is to generate an attenuated virus through codon pair deoptimization. This approach has the advantage of requiring little virus-specific knowledge beyond the genome sequence, which makes it well suited to emerging viruses for which extensive data may not be available. We performed comprehensive in silico analyses of several features of the SARS-CoV-2 genomic sequence (e.g., codon usage, codon pair usage, dinucleotide/junction dinucleotide usage, RNA structure around the frameshift region) in comparison with other members of the Coronaviridae family, the overall human genome, and the transcriptome of specific human tissues, such as lung, that are primarily targeted by the virus. Our analysis identified the spike (S) and nucleocapsid (N) proteins as promising targets for deoptimization and suggests a roadmap for SARS-CoV-2 vaccine development that can be generalized to other viruses.
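To make the sequence features named in the abstract concrete, the following is a minimal sketch of how codon usage, codon pair usage, and junction dinucleotide usage might be tallied from a single coding sequence. It is not the authors' code, which the report below notes was not shared. The function name `usage_counts` and the toy sequence are hypothetical, and a junction dinucleotide is assumed here to mean the third base of one codon followed by the first base of the next.

```python
from collections import Counter


def usage_counts(cds: str):
    """Tally codons, adjacent codon pairs, and junction dinucleotides.

    Hypothetical helper for illustration only; assumes `cds` is an
    in-frame coding sequence whose length is a multiple of three.
    """
    # Normalize case and treat RNA (U) and DNA (T) input the same way.
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")

    # Split the sequence into consecutive, non-overlapping codons.
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]

    codon_counts = Counter(codons)
    # A codon pair is two codons that are adjacent in the reading frame.
    pair_counts = Counter(zip(codons, codons[1:]))
    # Assumed definition: last base of a codon + first base of the next.
    junction_counts = Counter(a[2] + b[0] for a, b in zip(codons, codons[1:]))
    return codon_counts, pair_counts, junction_counts


# Toy sequence (not from any real genome): ATG GCT GCT AAA TAA
toy_cds = "ATGGCTGCTAAATAA"
codon_counts, pair_counts, junction_counts = usage_counts(toy_cds)
print(codon_counts["GCT"])           # 2
print(pair_counts[("GCT", "GCT")])   # 1
print(junction_counts["TG"])         # 1
```

Counts like these would typically be normalized and compared against host reference tables before drawing conclusions about which genes to deoptimize; that comparison step is outside the scope of this sketch.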

Article activity feed

  1. SciScore for 10.1101/2020.03.30.016832:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: All scripts for this calculation were written in Python 3.7.4.
    Resource: Python
    Suggested: (IPython, RRID:SCR_001658)

    Sentence: Comparison of Codon and Codon Pair Usage in Host Species: Codon, codon pair and dinucleotide usage data for Homo sapiens, Canis lupus familiaris, Chiroptera (bats) and Pholidota (pangolins) were downloaded from the CoCoPUTs database[5] on March 13, 2020.
    Resource: CoCoPUTs
    Suggested: (Codon and Codon-Pair Usage Tables, RRID:SCR_018504)

    Sentence: Likewise, human lung, kidney (cortex) and small intestine (terminal ileum) tissue-specific codon, codon pair and dinucleotide usage data were accessed from the TissueCoCoPUTs database[24] on March 13, 2020.
    Resource: TissueCoCoPUTs
    Suggested: None

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.