The mutation profile of SARS-CoV-2 is primarily shaped by the host antiviral defense

This article has been Reviewed by the following groups

Read the full article

Abstract

Understanding SARS-CoV-2 evolution is a fundamental effort in coping with the COVID-19 pandemic. The virus genomes have been broadly evolving due to the high number of infected hosts world-wide. Mutagenesis and selection are the two inter-dependent mechanisms of virus diversification. However, which mechanisms contribute to the mutation profiles of SARS-CoV-2 remain under-explored. Here, we delineate the contribution of mutagenesis and selection to the genome diversity of SARS-CoV-2 isolates. We generated a comprehensive phylogenetic tree with representative genomes. Instead of counting mutations relative to the reference genome, we identified each mutation event at the nodes of the phylogenetic tree. With this approach, we obtained the mutation events that are independent of each other and generated the mutation profile of SARS-CoV-2 genomes. The results suggest that the heterogeneous mutation patterns are mainly reflections of host (i) antiviral mechanisms that are achieved through APOBEC, ADAR, and ZAP proteins and (ii) probable adaptation against reactive oxygen species.

Importance

SARS-CoV-2 genomes are evolving worldwide. Revealing the evolutionary characteristics of SARS-CoV-2 is essential to understand host-virus interactions. Here, we aim to understand whether mutagenesis or selection is the primary driver of SARS-CoV-2 evolution. This study provides an unbiased computational method for profiling and analyzing independently occurring SARS-CoV-2 mutations. The results point out three host antiviral mechanisms shaping the mutational profile of SARS-CoV-2 through APOBEC, ADAR, and ZAP proteins. Besides, reactive oxygen species might have an impact on the SARS-CoV-2 mutagenesis.

Article activity feed

  1. SciScore for 10.1101/2021.02.02.429486: (What is this?)

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    SentencesResources
    Then, the cd-hit program was used to cluster sequences and choose representatives (-c 0.999 -M 0 -T 80) (Fu, Niu, Zhu, Wu, & Li, 2012). 20,089 clusters were created, of which 18,334 contained only a single sequence, while 1,755 of them contained multiple sequences (up to 14,867sequences in a cluster).
    cd-hit
    suggested: (CD-HIT, RRID:SCR_007105)
    After representative selection, representative sequences were aligned with the MAFFT algorithm using Augur toolkit (Hadfield et al., 2018; Katoh & Standley, 2016).
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    Then, a phylogenetic tree was constructed using IQ-TREE (-fast -n AUTO -m GTR).
    IQ-TREE
    suggested: (IQ-TREE, RRID:SCR_017254)
    The deformation counts were normalized by their division with their observation counts in the reference genome and plotted with their formation counts by ggplot2 in R studio (Wickham, 2016).
    ggplot2
    suggested: (ggplot2, RRID:SCR_014601)

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • No conflict of interest statement was detected. If there are no conflicts, we encourage authors to explicit state so.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.