Genome-wide mapping of therapeutically-relevant SARS-CoV-2 RNA structures

Abstract

SARS-CoV-2 is a betacoronavirus with a linear single-stranded, positive-sense RNA genome of ∼30 kb, whose outbreak caused the still ongoing COVID-19 pandemic. The ability of coronaviruses to rapidly evolve, adapt, and cross species barriers makes the development of effective and durable therapeutic strategies a challenging and urgent need. As for other RNA viruses, genomic RNA structures are expected to play crucial roles in several steps of the coronavirus replication cycle. Despite this, only a handful of functionally conserved structural elements within coronavirus RNA genomes have been identified to date.

Here, we performed RNA structure probing by SHAPE-MaP to obtain a single-base resolution secondary structure map of the full SARS-CoV-2 coronavirus genome. The SHAPE-MaP probing data recapitulate the previously described coronavirus RNA elements (5′ UTR, ribosomal frameshifting element, and 3′ UTR), and reveal new structures. Secondary structure-restrained 3D modeling of highly-structured regions across the SARS-CoV-2 genome allowed for the identification of several putative druggable pockets. Furthermore, ∼8% of the identified structure elements show significant covariation among SARS-CoV-2 and other coronaviruses, hinting at their functionally-conserved role. In addition, we identify a set of persistently single-stranded regions having high sequence conservation, suitable for the development of antisense oligonucleotide therapeutics.

Collectively, our work lays the foundation for the development of innovative RNA-targeted therapeutic strategies to fight SARS-related infections.

SciScore for 10.1101/2020.06.15.151647:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Experimental Models: Cell Lines
Sentences	Resources
Cell culture and SARS-CoV-2 infection: Vero E6 cells were cultured in T-175 flasks in Dulbecco’s modified Eagle’s medium (DMEM; Lonza, cat. 12-604F), supplemented with 8% fetal calf serum (FCS; Bodinco), 2 mM L-glutamine, 100 U/mL of penicillin and 100 µg/mL of streptomycin (Sigma Aldrich, cat.	Vero E6 suggested: RRID:CVCL_XD71
Software and Algorithms
Sentences	Resources
Multiplex SHAPE-MaP of SARS-CoV-2 RNA: For multiplex SHAPE-MaP, 70 oligonucleotide pairs, tiling the entire length of the SARS-CoV-2 genome (29,903 nt), were automatically designed using Primer3 (Untergasser et al., 2012) and the following parameters: amplicon size between 480 and 520 bp, maximum poly(N) length between 2 and 3, minimum/optimal/maximum oligonucleotide size of 20/25/30, minimum/optimal/maximum Tm of 56/60/62 degrees, minimum/optimal/maximum GC content of 30/50/65 %.	Primer3 suggested: (Primer3, RRID:SCR_003139)
Primers were then searched against the GENCODE v33 human transcriptome, keeping only those with less than 60% predicted base-pairing or more than 60% predicted base-pairing and more than 2 mismatched bases at the 3′ end.	GENCODE suggested: (GENCODE, RRID:SCR_014966)
Reads were trimmed of terminal Ns and low-quality bases (Phred < 20)	Phred suggested: (Phred, RRID:SCR_001017)
After calibrating the CM using the cmcalibrate module, it was used to search for RNA homologs in a database composed of all the non-redundant coronavirus complete genome sequences from the ViPR database (https://www.viprbrc.org/brc/home.spg?decorator=corona; Pickett et al., 2011), as well as a set of representative coronavirus genomes from NCBI database, using the cmsearch module.	ViPR suggested: (vipR, RRID:SCR_010685) NCBI suggested: (NCBI, RRID:SCR_006472)
Determination of low Shannon – high SHAPE regions’ conservation: To assess the sequence conservation of the identified low Shannon – high SHAPE regions, we computed 4 multiple sequence alignments using MAFFT v7.429 (parameters: --maxiterate 100 --auto; Katoh and Standley, 2013), the reference SARS-CoV-2 sequence and one of the following datasets: 1) SARS-CoV (243 sequences); 2) MERS-CoV (281 sequences); 3) other Beta-CoV (excluding SARS-CoV/SARS-CoV-2/MERS-CoV, 681 sequences); 4) other CoV (excluding Beta-CoV, 1657 sequences).	MAFFT suggested: (MAFFT, RRID:SCR_011811)
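
Two of the quoted steps, the primer filter against the human transcriptome and the read trimming, are simple rules that can be written out. The following is a minimal sketch, not the authors' pipeline: the function names, the use of a fraction between 0 and 1 for predicted base-pairing, and the choice to trim from both read ends are assumptions made for illustration. The quoted sentence also does not say how a primer at exactly 60% base-pairing is handled; this sketch discards it.

```python
def keep_primer(paired_fraction, three_prime_mismatches):
    """Primer filter as worded in the quoted sentence (hypothetical helper).

    Keep a primer if less than 60% of it is predicted to base-pair with a
    human transcript, or if more than 60% base-pairs but more than 2 bases
    at the 3' end are mismatched. Exactly 60% is not covered by the source;
    this sketch rejects it.
    """
    if paired_fraction < 0.60:
        return True
    return paired_fraction > 0.60 and three_prime_mismatches > 2


def trim_read(seq, quals, min_phred=20):
    """Trim N calls and bases with Phred < min_phred from the read ends.

    Assumption: trimming is applied to both ends and stops at the first
    base that is neither N nor low quality. `quals` is a list of integer
    Phred scores, one per base.
    """
    def is_bad(i):
        return seq[i] in "Nn" or quals[i] < min_phred

    start, end = 0, len(seq)
    while start < end and is_bad(start):
        start += 1
    while end > start and is_bad(end - 1):
        end -= 1
    return seq[start:end], quals[start:end]
```

For example, under these assumptions `trim_read("NNACGTN", [30, 30, 10, 30, 30, 30, 30])` drops the two leading Ns, the low-quality A, and the trailing N, leaving `CGT`.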

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.
