Intra-genome variability in the dinucleotide composition of SARS-CoV-2
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
CpG dinucleotides are under-represented in the genomes of single-stranded RNA viruses, and SARS-CoV-2 is no exception to this. Artificial modification of CpG frequency is a valid approach for live attenuated vaccine development; if this is to be applied to SARS-CoV-2, we must first understand the role CpG motifs play in regulating SARS-CoV-2 replication. Accordingly, the CpG composition of the SARS-CoV-2 genome was characterised. CpG suppression among coronaviruses does not differ between virus genera but does vary with host species and primary replication site (a proxy for tissue tropism), supporting the hypothesis that viral CpG content may influence cross-species transmission. Although SARS-CoV-2 exhibits overall strong CpG suppression, this varies considerably across the genome, and the Envelope (E) open reading frame (ORF) and ORF10 demonstrate an absence of CpG suppression. Across the Coronaviridae, E genes display remarkably high variation in CpG composition, with those of SARS and SARS-CoV-2 having much higher CpG content than other coronaviruses isolated from humans. This is an ancestrally derived trait reflecting their bat origins. Conservation of CpG motifs in these regions suggests that they have a functionality which over-rides the need to suppress CpG; an observation relevant to future strategies towards a rationally attenuated SARS-CoV-2 vaccine.
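The abstract's notion of CpG suppression is usually quantified as an observed:expected (O:E) ratio, where values below 1 indicate under-representation. The following is a minimal sketch under stated assumptions, not the authors' pipeline: it uses the common definition O:E = (N_CpG × N) / (N_C × N_G), with sequence length N as the normaliser, and the function names `cpg_oe_ratio` and `windowed_cpg_oe`, along with the window and step sizes, are hypothetical choices made here for illustration.

```python
import math


def cpg_oe_ratio(seq: str) -> float:
    """Observed:expected CpG ratio for one nucleotide sequence.

    Assumed definition (not taken from the article):
        O:E = (count of CG dinucleotides * length) / (count of C * count of G)
    Returns NaN when the ratio is undefined (no C or no G in the sequence).
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    if n < 2 or c == 0 or g == 0:
        return math.nan
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    cpg = seq.count("CG")
    return (cpg * n) / (c * g)


def windowed_cpg_oe(seq: str, window: int = 500, step: int = 250):
    """O:E ratio in sliding windows, to expose variation along a genome.

    Returns a list of (start_index, ratio) tuples. The window and step
    defaults are illustrative values only.
    """
    last_start = max(len(seq) - window, 0)
    return [
        (start, cpg_oe_ratio(seq[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    toy = "CCGG" * 3  # toy sequence, not real viral data
    print(cpg_oe_ratio(toy))
    print(windowed_cpg_oe(toy, window=4, step=4))
```

Applied per ORF or per window, a calculation of this kind is one way to see whether a region such as E or ORF10 departs from the genome-wide level of suppression. Some definitions normalise by the number of dinucleotide positions (N − 1) rather than N, which changes the values slightly for short sequences.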
Article activity feed
SciScore for 10.1101/2020.05.08.083816: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms
Sentence: Sequences were annotated into animal groups and genera based on their description in the NCBI database.
Resource: NCBI; suggested: (NCBI, RRID:SCR_006472)
Sentence: Bats are of the order Chiroptera; multiple avian orders were grouped together (Galliformes, Anseriformes, Passeriformes, Gruiformes, Columbiformes and Pelecaniformes); even-toed (Artiodactyla) and odd-toed (Perissodactyla) ungulate orders were grouped, with camelids analysed separately due to their association with MERS-CoV (Azhar, et al. 2014); Canidae (canine) and Pantherinae (feline) sequences of the Carnivora order were analysed separately, as canines have previously been suggested as an intermediate host species for SARS-CoV-2 (Xia 2020) and cat infections with SARS-CoV-2 have been reported (Shi, et al. 2020); humans were the only representatives from the Primate order; all remaining Carnivora, with the exception of a single civet sequence, belonged to the Mustelidae (mustelids); rodents belong to the Rodentia order; and swine belong to the Artiodactyla order; whales are also Artiodactyla but swine were considered separately due to considerable interest in porcine coronaviruses (Vlasova, et al. 2020).
Resource: SARS-CoV-2; suggested: (Active Motif Cat# 91351, RRID:AB_2847848)
Sentence: Phylogenetic analyses: E ORFs were aligned in MEGA X (Kumar, et al. 2018) using the Clustal method.
Resource: MEGA; suggested: (Mega BLAST, RRID:SCR_011920)
Sentence: Statistical analyses: Comparison to determine whether there was a statistically significant difference across groups was performed using a 1-way ANOVA in GraphPad Prism.
Resource: GraphPad Prism; suggested: (GraphPad Prism, RRID:SCR_002798)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
Another limitation of this analysis is that only sequences of greater than 10% divergence were included. Tissue tropism can be defined by much smaller differences; for example, a deletion in the spike protein of transmissible gastroenteritis virus (a porcine coronavirus) altered the tropism of the virus from enteric to respiratory, while nucleotide identity was preserved at 96% (Cox, et al. 1990; Rasschaert, et al. 1990). Further study on tissue tropisms of coronaviruses, as well as tissue expression profiles and antiviral activities of ZAP are needed to validate these analyses. Loss of CpG motifs during adaptation to the human host has been previously described for influenza A virus (Greenbaum, et al. 2008), highlighting the importance of CpG composition for host adaptation. For SARS-CoV-2, we determined a genomic CpG O:E ratio of 0.408, which is similar to the human genome CpG O:E ratio of 0.2-0.4 (McClelland and Ivarie 1982; Sved and Bird 1990; Tomso and Bell 2003). Mimicry of the CpG composition of the host by ssRNA viruses is considered a mechanism to subvert detection by the innate immune response (Simmonds, et al. 2013; Takata, et al. 2017) and speculatively this may indicate that SARS-CoV-2 was genetically predisposed to make a host switch into humans. Similarly, the genomic CPB score of 0.048 indicates that SARS-CoV-2 uses codon pairs which are preferentially utilised in the human ORFeome, which may mean that the virus was well suited for translational efficiency in ...
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.