In-depth Bioinformatic Analyses of Human SARS-CoV-2, SARS-CoV, MERS-CoV, and Other Nidovirales Suggest Important Roles of Noncanonical Nucleic Acid Structures in Their Lifecycles

Abstract

Noncanonical nucleic acid structures play important roles in the regulation of molecular processes. Given the importance of the ongoing coronavirus crisis, we evaluated the genomes of all coronaviruses sequenced to date (more broadly, the order Nidovirales) to determine whether they contain noncanonical nucleic acid structures. We found abundant evidence of putative G-quadruplex sites and even more evidence of inverted repeat (IR) loci, which are ubiquitous along the whole genomic sequence and indicate a possible mechanism for genomic RNA packaging. The most notable enrichment of IRs was found inside the 5′UTR for IRs of 12 or more nucleotides, and the most notable enrichment of putative quadruplex sites (PQSs) was located before the 3′UTR, inside the 5′UTR, and before mRNA. This indicates crucial regulatory roles for both IRs and PQSs. Moreover, we found multiple G-quadruplex binding motifs in human proteins with the potential to bind SARS-CoV-2 RNA. Noncanonical nucleic acid structures in Nidovirales and in the novel SARS-CoV-2 are therefore promising druggable structures that can be targeted and utilized in the future.
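
As a rough illustration of the two kinds of loci the abstract refers to, the Python sketch below scans an RNA sequence for putative quadruplex sites and for inverted repeats. It is not the authors' pipeline: the PQS search uses a simple regular-expression heuristic (four runs of at least three G separated by loops of 1 to 7 nucleotides), and the function names, the arm length, and the spacer limit are all assumptions chosen for the example.

```python
import re

# Assumed heuristic: four G-runs (>= 3 G each) separated by loops of 1-7 nt.
# This is a motif-style approximation, not the scoring used in the preprint.
PQS_PATTERN = re.compile(
    r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}"
)

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def normalize(seq):
    """Upper-case the sequence and convert DNA-style T to RNA U."""
    return seq.upper().replace("T", "U")


def find_pqs(seq):
    """Return (start, matched_sequence) for each non-overlapping PQS match."""
    seq = normalize(seq)
    return [(m.start(), m.group()) for m in PQS_PATTERN.finditer(seq)]


def reverse_complement(seq):
    """Reverse complement of an RNA sequence written with A, C, G, U."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=6, max_spacer=10):
    """Return (start, arm_length, spacer_length) for perfect inverted repeats.

    An inverted repeat here is an arm of `arm` nucleotides followed, after a
    spacer of 0..max_spacer nucleotides, by the arm's reverse complement.
    Both default values are illustrative assumptions.
    """
    seq = normalize(seq)
    n = len(seq)
    hits = []
    for i in range(n - 2 * arm + 1):
        target = reverse_complement(seq[i:i + arm])
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            if j + arm > n:
                break
            if seq[j:j + arm] == target:
                hits.append((i, arm, spacer))
    return hits
```

Calling `find_pqs` and `find_inverted_repeats` on a genome string gives lists of candidate loci whose positions could then be compared with annotated features such as the 5′UTR or 3′UTR.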

Article activity feed

  1. SciScore for 10.1101/2020.04.09.031252:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    All the results were merged into a single Microsoft Excel file where statistical analysis was then made.
    Microsoft Excel
    suggested: (Microsoft Excel, RRID:SCR_016137)
    Features tables of 109 Nidovirales genomes were downloaded from the NCBI database and grouped by their names as stated in the feature table file.
    NCBI
    suggested: (NCBI, RRID:SCR_006472)
    Complete analyses of IRs occurrence in Nidovirales are provided in Supplementary Material 4. 2.5 RNA Fold Predictions: In order to be able to display higher structures of the coronavirus genome, we used Galaxy’s free-online webserver (Afgan et al., 2018) and its RNA fold tool (Lorenz et al., 2011).
    Galaxy’s
    suggested: (BioBlend Library, RRID:SCR_014557)
    2.6 Multiple Alignment of SUD Domains (M Regions) in Nsp3 of Pathogenic Species: Multiple protein alignment was done using MUSCLE (Edgar, 2004) under default parameters (UGENE [Okonechnikov et al., 2012] workflow was used).
    MUSCLE
    suggested: (MUSCLE, RRID:SCR_011812)
    The output was further filtered in Excel to keep only those hits below p-value = 1 × 10⁻⁶.
    Excel
    suggested: None
    Supplementary Material 1: Summary of analyzed Nidovirales genomes (full names, phylogenetic groups, exact NCBI accession, and further information); Supplementary Material 2: Complete analyses of PQS occurrence in Nidovirales; Supplementary Material 3: Categorization of IRs according to their overlap with a feature or feature neighborhood; Supplementary Material 4: Complete analyses of IRs occurrence in Nidovirales; Supplementary Material 5: RNA fold prediction for SARS-CoV-2 RNA and random RNA of the same length and GC content; Supplementary Material 6: Complete RBPmap results – Prediction of human RNA-binding protein sites in SARS-CoV-2 RNA; Supplementary Material 7: Prediction of RGG-rich NIQI motifs in proteins identified by RBPmap
    RBPmap
    suggested: None
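
One of the quoted sentences mentions comparing SARS-CoV-2 RNA with random RNA of the same length and GC content. The preprint excerpt does not say how that control was generated, so the following Python sketch is only one plausible way to build such a sequence; the function names and the `seed` parameter are hypothetical.

```python
import random


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def random_rna_like(seq, seed=None):
    """Random RNA with the same length and the same G+C count as `seq`.

    Assumption: only the combined G+C count is preserved; the individual
    base frequencies and any dinucleotide composition are not.
    """
    rng = random.Random(seed)
    n = len(seq)
    n_gc = sum(1 for base in seq.upper() if base in "GC")
    bases = [rng.choice("GC") for _ in range(n_gc)]
    bases += [rng.choice("AU") for _ in range(n - n_gc)]
    rng.shuffle(bases)  # scatter the G/C positions along the sequence
    return "".join(bases)
```

Passing a fixed `seed` makes the control reproducible, which matters if fold predictions on the random sequence are to be reported alongside those for the viral genome.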

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.