Coronavirus genomes carry the signatures of their habitats
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Coronaviruses such as SARS-CoV-2 regularly infect host tissues that express antiviral proteins (AVPs) in abundance. Understanding how they evolve to adapt to or evade host immune responses is important in the effort to control the spread of infection. Two AVPs that may shape viral genomes are the zinc finger antiviral protein (ZAP) and the apolipoprotein B mRNA editing enzyme-catalytic polypeptide-like 3 (APOBEC3). The former binds to CpG dinucleotides to facilitate the degradation of viral transcripts, while the latter frequently deaminates C into U residues, which could generate notable viral sequence variation. We tested the hypothesis that both APOBEC3 and ZAP impose selective pressures that shape the genome of an infecting coronavirus. Our investigation considered a comprehensive set of publicly available genomes for seven coronaviruses (SARS-CoV-2, SARS-CoV, and MERS infecting Homo sapiens, Bovine CoV infecting Bos taurus, MHV infecting Mus musculus, HEV infecting Sus scrofa, and CRCoV infecting Canis lupus familiaris). We show that coronaviruses that regularly infect tissues with abundant AVPs have CpG-deficient and U-rich genomes, whereas those that do not infect tissues with abundant AVPs do not share these sequence hallmarks. Among the coronaviruses surveyed herein, CpG is most deficient in SARS-CoV-2, and a temporal analysis showed a marked increase in C to U mutations over four months of SARS-CoV-2 genome evolution. Furthermore, the preferred motifs in which these C to U mutations occur are the same as those subjected to APOBEC3 editing in HIV-1. These results suggest that both ZAP and APOBEC3 shape the SARS-CoV-2 genome: ZAP imposes a strong CpG avoidance, and APOBEC3 constantly edits C to U. Evolutionary pressures exerted by host immune systems onto viral genomes may motivate novel strategies for SARS-CoV-2 vaccine development.
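The two sequence hallmarks named in the abstract, CpG deficiency and U richness, can be illustrated with a short sketch. It uses a common observed/expected CpG ratio, f(CpG) / (f(C) * f(G)); this choice is an assumption made for illustration and is not confirmed here to be the exact index used by the authors. The function names `cpg_obs_exp` and `u_fraction` are hypothetical.

```python
from collections import Counter


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: f(CG) / (f(C) * f(G)).

    Assumed illustrative index, not necessarily the one used in the paper.
    Values below 1 indicate fewer CpG dinucleotides than expected from
    the C and G content alone.
    """
    s = seq.upper().replace("U", "T")  # treat RNA and DNA alphabets alike
    n = len(s)
    if n < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    counts = Counter(s)
    f_c = counts["C"] / n
    f_g = counts["G"] / n
    if f_c == 0 or f_g == 0:
        return 0.0  # no C or no G, so no CpG is possible
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    f_cg = s.count("CG") / (n - 1)
    return f_cg / (f_c * f_g)


def u_fraction(seq: str) -> float:
    """Fraction of U (written as T in DNA-style records) in the sequence."""
    s = seq.upper().replace("U", "T")
    if not s:
        raise ValueError("sequence must not be empty")
    return s.count("T") / len(s)
```

Under this definition, a genome with a ratio well below 1 and a high U fraction would match the "CpG-deficient and U-rich" pattern the abstract describes.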
Article activity feed
SciScore for 10.1101/2020.06.13.149591:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: Next, among 2666 high sequence quality and complete SARS-CoV-2 genomes from CNCB, we randomly selected one genome from each collection date, inclusively between December 31, 2019 (first isolate) and May 6, 2020 (most recent isolate, database last accessed on May 16, 2020), that have complete records of local region annotations and nucleotide sequences in NCBI.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Experimental Models: Organisms/Strains
- Sentence: A total of 99 variants (or samples) were retrieved across 127 days since SARS-CoV-2 (strain Wuhan-Hu-1, MN908947) was first sequenced.
  Resource: Wuhan-Hu-1, suggested: None
Software and Algorithms
- Sentence: Specifically, we calculated the proportion of mRNA expression (PME) as: PME values were calculated from averaged TPM values in 24 human tissues using all RNA-Seq datasets available in the GTEx Portal (Lonsdale et al. 2013), from averaged FPKM values in 26 cattle tissues using the Bovine Genome Database (Shamimuzzaman et al. 2019), from averaged FPKM values in 33 pig tissues using TISSUE 2.0 integrated datasets (Palasca et al. 2018), from averaged FPKM values in 17 mice tissues using all 741 RNA-Seq datasets in mouse ENCODE consortium (Yue et al. 2014), from averaged FPKM values in 12 mice tissues using 79 RNA-Seq datasets in BioProject PRJNA516470 (Naqvi et al. 2019), and from averaged fluorescence intensity units in 10 dog tissues using all 39 microarray datasets in BioProject PRJNA124245 (Briggs et al. 2011).
  Resource: BioProject, suggested: (NCBI BioProject, RRID:SCR_004801)
- Sentence: Additionally, the complete genomic sequences of 403 MERS strains, 134 SARS-CoV strains, 20 Bovine CoV strains, 2 Canine CoV strains, 26 Murine HEV strains, and 10 Porcine HEV strains were downloaded from the National Center for Biotechnology Information (NCBI) Nucleotide Database (https://www.ncbi.nlm.nih.gov/).
  Resource: suggested: (GENSAT at NCBI - Gene Expression Nervous System Atlas, RRID:SCR_003923)
- Sentence: To make a fair comparison between strains, the genomes were aligned with MAFFT version 7 (Katoh and Standley 2013), with the slow but accurate G-INS-1 option for 134 SARS-CoV, 20 Bovine CoV, 2 Canine CoV, 26 Murine MHV, and 10 Porcine HEV strains, and with the fast FFT-NS-2 option for large alignments for 2666 SARS-CoV-2 and 403 MERS strains.
  Resource: MAFFT, suggested: (MAFFT, RRID:SCR_011811)
Results from OddPub: Thank you for sharing your data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
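
The sampling step quoted under Randomization in Table 1 (one randomly selected genome per collection date within a date window) could be sketched as follows. This is a minimal illustration, not the authors' code: the function name `pick_one_per_date`, the `(accession, collection_date)` record layout, and the fixed seed are all assumptions.

```python
import random
from collections import defaultdict
from datetime import date


def pick_one_per_date(records, start, end, seed=0):
    """Pick one accession at random for each collection date in [start, end].

    `records` is an iterable of (accession, collection_date) pairs; this
    layout is hypothetical. A seeded generator keeps the draw reproducible.
    """
    rng = random.Random(seed)
    by_date = defaultdict(list)
    for accession, collected in records:
        if start <= collected <= end:  # inclusive window, as in the quoted text
            by_date[collected].append(accession)
    # Sort dates and candidates so the result does not depend on input order.
    return {d: rng.choice(sorted(ids)) for d, ids in sorted(by_date.items())}
```

Called with the window December 31, 2019 to May 6, 2020, this returns at most one genome per day, which is consistent with the report's 99 variants spread across 127 days (not every date has an eligible genome).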