Genome-wide mapping of therapeutically-relevant SARS-CoV-2 RNA structures

Abstract

SARS-CoV-2 is a betacoronavirus with a linear single-stranded, positive-sense RNA genome of ∼30 kb, whose outbreak caused the still ongoing COVID-19 pandemic. The ability of coronaviruses to rapidly evolve, adapt, and cross species barriers makes the development of effective and durable therapeutic strategies a challenging and urgent need. As for other RNA viruses, genomic RNA structures are expected to play crucial roles in several steps of the coronavirus replication cycle. Despite this, only a handful of functionally conserved structural elements within coronavirus RNA genomes have been identified to date.

Here, we performed RNA structure probing by SHAPE-MaP to obtain a single-base resolution secondary structure map of the full SARS-CoV-2 coronavirus genome. The SHAPE-MaP probing data recapitulate the previously described coronavirus RNA elements (5′ UTR, ribosomal frameshifting element, and 3′ UTR), and reveal new structures. Secondary structure-restrained 3D modeling of highly-structured regions across the SARS-CoV-2 genome allowed for the identification of several putative druggable pockets. Furthermore, ∼8% of the identified structure elements show significant covariation among SARS-CoV-2 and other coronaviruses, hinting at their functionally-conserved role. In addition, we identify a set of persistently single-stranded regions having high sequence conservation, suitable for the development of antisense oligonucleotide therapeutics.

Collectively, our work lays the foundation for the development of innovative RNA-targeted therapeutic strategies to fight SARS-related infections.

Article activity feed

  1. SciScore for 10.1101/2020.06.15.151647:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Experimental Models: Cell Lines
    Sentences | Resources
    Cell culture and SARS-CoV-2 infection: Vero E6 cells were cultured in T-175 flasks in Dulbecco’s modified Eagle’s medium (DMEM; Lonza, cat. 12-604F), supplemented with 8% fetal calf serum (FCS; Bodinco), 2 mM L-glutamine, 100 U/mL of penicillin and 100 µg/mL of streptomycin (Sigma Aldrich, cat.
    Vero E6
    suggested: RRID:CVCL_XD71
    Software and Algorithms
    Sentences | Resources
    Multiplex SHAPE-MaP of SARS-CoV-2 RNA: For multiplex SHAPE-MaP, 70 oligonucleotide pairs, tiling the entire length of the SARS-CoV-2 genome (29,903 nt), were automatically designed using Primer3 (Untergasser et al., 2012) and the following parameters: amplicon size between 480 and 520 bp, maximum poly(N) length between 2 and 3, minimum/optimal/maximum oligonucleotide size of 20/25/30 nt, minimum/optimal/maximum Tm of 56/60/62 °C, minimum/optimal/maximum GC content of 30/50/65%.
    Primer3
    suggested: (Primer3, RRID:SCR_003139)
    Primers were then searched against the GENCODE v33 human transcriptome, keeping only those with less than 60% predicted base-pairing or more than 60% predicted base-pairing and more than 2 mismatched bases at the 3′ end.
    GENCODE
    suggested: (GENCODE, RRID:SCR_014966)
    Reads were trimmed of terminal Ns and low-quality bases (Phred < 20)
    Phred
    suggested: (Phred, RRID:SCR_001017)
    After calibrating the CM using the cmcalibrate module, it was used to search for RNA homologs in a database composed of all the non-redundant coronavirus complete genome sequences from the ViPR database (https://www.viprbrc.org/brc/home.spg?decorator=corona; Pickett et al., 2011), as well as a set of representative coronavirus genomes from NCBI database, using the cmsearch module.
    ViPR
    suggested: (vipR, RRID:SCR_010685)
    NCBI
    suggested: (NCBI, RRID:SCR_006472)
    Determination of low Shannon – high SHAPE regions’ conservation: To assess the sequence conservation of the identified low Shannon – high SHAPE regions, we computed 4 multiple sequence alignments using MAFFT v7.429 (parameters: --maxiterate 100 --auto; Katoh and Standley, 2013), the reference SARS-CoV-2 sequence and one of the following datasets: 1) SARS-CoV (243 sequences); 2) MERS-CoV (281 sequences); 3) other Beta-CoV (excluding SARS-CoV/SARS-CoV-2/MERS-CoV, 681 sequences); 4) other CoV (excluding Beta-CoV, 1657 sequences).
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
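
    The primer criteria quoted in the table above (oligonucleotide size, GC content, homopolymer length, amplicon size, and the off-target rule applied after the GENCODE search) can be sketched as simple checks. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the Tm criterion is left out because it needs a thermodynamic model, the poly(N) limit is assumed to be the upper value of the quoted "2 to 3" range, and a primer at exactly 60% predicted base-pairing (a case the quoted sentence does not cover) is assumed to need the 3′ mismatch check.

```python
# Values quoted in the SciScore excerpt; names are illustrative only.
MIN_LEN, MAX_LEN = 20, 30              # oligonucleotide size (nt)
MIN_GC, MAX_GC = 30.0, 65.0            # GC content (%)
MIN_AMPLICON, MAX_AMPLICON = 480, 520  # amplicon size (bp)
MAX_HOMOPOLYMER = 3                    # assumption: upper end of the "2 to 3" range
PAIRING_CUTOFF = 0.60                  # predicted off-target base-pairing fraction


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def longest_run(seq: str) -> int:
    """Length of the longest stretch of one repeated base."""
    best = run = 0
    prev = ""
    for base in seq.upper():
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best


def passes_composition(seq: str) -> bool:
    """Size, GC and homopolymer checks (Tm is deliberately not modeled)."""
    return (MIN_LEN <= len(seq) <= MAX_LEN
            and MIN_GC <= gc_percent(seq) <= MAX_GC
            and longest_run(seq) <= MAX_HOMOPOLYMER)


def amplicon_ok(size_bp: int) -> bool:
    """Amplicon size must fall within the quoted 480-520 bp window."""
    return MIN_AMPLICON <= size_bp <= MAX_AMPLICON


def passes_off_target(paired_fraction: float, mismatches_3p: int) -> bool:
    """Keep a primer if it pairs weakly with the human transcriptome,
    or pairs strongly but has more than 2 mismatches at its 3' end."""
    if paired_fraction < PAIRING_CUTOFF:
        return True
    return mismatches_3p > 2
```

    For instance, a primer predicted to be 80% base-paired to a human transcript would be kept only if more than two of its 3′-terminal bases are mismatched, since a mismatched 3′ end is unlikely to prime extension.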

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.