Genome-wide mapping of SARS-CoV-2 RNA structures identifies therapeutically-relevant elements

Abstract

SARS-CoV-2 is a betacoronavirus with a linear single-stranded, positive-sense RNA genome, whose outbreak caused the ongoing COVID-19 pandemic. The ability of coronaviruses to rapidly evolve, adapt, and cross species barriers makes the development of effective and durable therapeutic strategies a challenging and urgent need. As for other RNA viruses, genomic RNA structures are expected to play crucial roles in several steps of the coronavirus replication cycle. Despite this, only a handful of functionally-conserved coronavirus structural RNA elements have been identified to date. Here, we performed RNA structure probing to obtain single-base resolution secondary structure maps of the full SARS-CoV-2 coronavirus genome both in vitro and in living infected cells. Probing data recapitulate the previously described coronavirus RNA elements (5′ UTR and s2m), and reveal new structures. Of these, ∼10.2% show significant covariation among SARS-CoV-2 and other coronaviruses, hinting at their functionally-conserved role. Secondary structure-restrained 3D modeling of these segments further allowed for the identification of putative druggable pockets. In addition, we identify a set of single-stranded segments in vivo, showing high sequence conservation, suitable for the development of antisense oligonucleotide therapeutics. Collectively, our work lays the foundation for the development of innovative RNA-targeted therapeutic strategies to fight SARS-related infections.

Article activity feed

  1. SciScore for 10.1101/2020.06.15.151647

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Experimental Models: Cell Lines
    Sentences | Resources
    Cell culture and SARS-CoV-2 infection: Vero E6 cells were cultured in T-175 flasks in Dulbecco’s modified Eagle’s medium (DMEM; Lonza, cat. 12-604F), supplemented with 8% fetal calf serum (FCS; Bodinco), 2 mM L-glutamine, 100 U/mL of penicillin and 100 µg/mL of streptomycin (Sigma Aldrich, cat.
    Vero E6
    suggested: RRID:CVCL_XD71
    Software and Algorithms
    Sentences | Resources
    Multiplex SHAPE-MaP of SARS-CoV-2 RNA: For multiplex SHAPE-MaP, 70 oligonucleotide pairs, tiling the entire length of the SARS-CoV-2 genome (29,903 nt), were automatically designed using Primer3 (Untergasser et al., 2012) and the following parameters: amplicon size between 480 and 520 bp, maximum poly(N) length between 2 and 3, minimum/optimal/maximum oligonucleotide size of 20/25/30, minimum/optimal/maximum Tm of 56/60/62 degrees, minimum/optimal/maximum GC content of 30/50/65 %.
    Primer3
    suggested: (Primer3, RRID:SCR_003139)
    Primers were then searched against the GENCODE v33 human transcriptome, keeping only those with less than 60% predicted base-pairing or more than 60% predicted base-pairing and more than 2 mismatched bases at the 3′ end.
    GENCODE
    suggested: (GENCODE, RRID:SCR_014966)
    Reads were trimmed of terminal Ns and low-quality bases (Phred < 20)
    Phred
    suggested: (Phred, RRID:SCR_001017)
    After calibrating the CM using the cmcalibrate module, it was used to search for RNA homologs in a database composed of all the non-redundant coronavirus complete genome sequences from the ViPR database (https://www.viprbrc.org/brc/home.spg?decorator=corona; Pickett et al., 2011), as well as a set of representative coronavirus genomes from NCBI database, using the cmsearch module.
    ViPR
    suggested: (vipR, RRID:SCR_010685)
    NCBI
    suggested: (NCBI, RRID:SCR_006472)
    Determination of low Shannon – high SHAPE regions’ conservation: To assess the sequence conservation of the identified low Shannon – high SHAPE regions, we computed 4 multiple sequence alignments using MAFFT v7.429 (parameters: --maxiterate 100 --auto; Katoh and Standley, 2013), the reference SARS-CoV-2 sequence and one of the following datasets: 1) SARS-CoV (243 sequences); 2) MERS-CoV (281 sequences); 3) other Beta-CoV (excluding SARS-CoV/SARS-CoV-2/MERS-CoV, 681 sequences); 4) other CoV (excluding Beta-CoV, 1657 sequences).
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
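
The primer-design sentences quoted in Table 2 describe two concrete steps: a set of Primer3 constraints, and a rule for discarding primers that might anneal to human transcripts. The sketch below is one possible way to write them down, not the authors' pipeline. The tag names are standard Primer3 Boulder-IO tags; the function names `boulder_record` and `keep_primer`, the choice of 3 for the maximum poly-X run (the source gives "between 2 and 3"), and the handling of exactly 60% base-pairing (the source does not cover that case) are assumptions made for illustration.

```python
# Primer3 global tags (Boulder-IO names) mirroring the parameters quoted above.
PRIMER3_SETTINGS = {
    "PRIMER_PRODUCT_SIZE_RANGE": "480-520",  # amplicon size in bp
    "PRIMER_MAX_POLY_X": 3,                  # assumption: source says "between 2 and 3"
    "PRIMER_MIN_SIZE": 20,
    "PRIMER_OPT_SIZE": 25,
    "PRIMER_MAX_SIZE": 30,
    "PRIMER_MIN_TM": 56.0,
    "PRIMER_OPT_TM": 60.0,
    "PRIMER_MAX_TM": 62.0,
    "PRIMER_MIN_GC": 30.0,
    "PRIMER_OPT_GC_PERCENT": 50.0,
    "PRIMER_MAX_GC": 65.0,
}


def boulder_record(sequence_id, template, settings=PRIMER3_SETTINGS):
    """Format one Boulder-IO input record for the primer3_core program."""
    lines = [f"SEQUENCE_ID={sequence_id}", f"SEQUENCE_TEMPLATE={template}"]
    lines += [f"{tag}={value}" for tag, value in settings.items()]
    lines.append("=")  # a lone "=" terminates a Boulder-IO record
    return "\n".join(lines) + "\n"


def keep_primer(paired_fraction, mismatches_3p):
    """Apply the quoted filter to one primer/transcript hit.

    paired_fraction: predicted fraction of primer bases paired (0.0 to 1.0).
    mismatches_3p:   number of mismatched bases at the primer's 3' end.
    A primer is kept if pairing is below 60%, or if pairing is above 60%
    but more than 2 bases at the 3' end are mismatched. Exactly 60% is not
    covered by the source; this sketch discards it (assumption).
    """
    if paired_fraction < 0.60:
        return True
    return paired_fraction > 0.60 and mismatches_3p > 2
```

A record produced by `boulder_record("tile_01", template_sequence)` could be piped to `primer3_core`, and `keep_primer` would then be applied to each candidate after searching it against the transcriptome; both the identifier `tile_01` and the variable `template_sequence` are placeholders.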

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.