RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses
Abstract
As the COVID-19 outbreak spreads, there is a growing need for a compilation of conserved RNA genome regions in the SARS-CoV-2 virus along with their structural propensities to guide development of antivirals and diagnostics. Using sequence alignments spanning a range of betacoronaviruses, we rank genomic regions by RNA sequence conservation, identifying 79 regions of length at least 15 nucleotides as exactly conserved over SARS-related complete genome sequences available near the beginning of the COVID-19 outbreak. We then confirm the conservation of the majority of these genome regions across 739 SARS-CoV-2 sequences reported to date from the current COVID-19 outbreak, and we present a curated list of 30 ‘SARS-related-conserved’ regions. We find that known RNA structured elements curated as Rfam families and in prior literature are enriched in these conserved genome regions, and we predict additional conserved, stable secondary structures across the viral genome. We provide 106 ‘SARS-CoV-2-conserved-structured’ regions as potential targets for antivirals that bind to structured RNA. We further provide detailed secondary structure models for the 5’ UTR, frame-shifting element, and 3’ UTR. Last, we predict regions of the SARS-CoV-2 viral genome that have low propensity for RNA secondary structure and are conserved within SARS-CoV-2 strains. These 59 ‘SARS-CoV-2-conserved-unstructured’ genomic regions may be most easily targeted in primer-based diagnostic and oligonucleotide-based therapeutic strategies.
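The abstract's notion of regions "exactly conserved" over at least 15 nucleotides can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes the alignment is already available as a list of equal-length, gapped strings, and the function name `exactly_conserved_regions`, its parameters, and the toy sequences are hypothetical choices made only for illustration. Coordinates are alignment columns, not reference-genome positions.

```python
from typing import List, Tuple


def exactly_conserved_regions(
    alignment: List[str], min_len: int = 15
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges where every aligned
    sequence carries the same nucleotide, keeping runs of >= min_len.

    Hypothetical helper for illustration; gaps and ambiguity codes
    (e.g. '-' or 'N') are treated as breaking conservation.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    regions: List[Tuple[int, int]] = []
    run_start = None  # first column of the current conserved run, if any
    for col in range(width):
        column = {seq[col].upper() for seq in alignment}
        conserved = len(column) == 1 and next(iter(column)) in "ACGTU"
        if conserved and run_start is None:
            run_start = col
        elif not conserved and run_start is not None:
            if col - run_start >= min_len:
                regions.append((run_start, col))
            run_start = None
    # Close a run that extends to the last column of the alignment.
    if run_start is not None and width - run_start >= min_len:
        regions.append((run_start, width))
    return regions


# Toy alignment (made-up sequences, far shorter than a real genome).
toy = [
    "ACGUACGUAC",
    "ACGUACGAAC",
    "ACGUAC-UAC",
]
print(exactly_conserved_regions(toy, min_len=4))
```

With the default `min_len=15` the toy alignment yields no regions, because its longest conserved run is only six columns. Ranking regions, as the abstract describes, would additionally require a per-window conservation score, which this sketch does not attempt.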
Article activity feed
SciScore for 10.1101/2020.03.27.012906: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Sentence: These computations were carried out using the Arnie package (https://github.com/DasLab/arnie).
Resource: Arnie, suggested: (ARNIE, RRID:SCR_000514)
Sentence: The repository additionally includes alignment files, Rfam families and covariance models, and output from the RNAz, R-scape, alifoldz and RNAplfold analyses.
Resource: Rfam, suggested: (Rfam, RRID:SCR_007891)
Results from OddPub: Thank you for sharing your code.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- No conflict of interest statement was detected. If there are no conflicts, we encourage authors to explicitly state so.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
