Substitutions and codon usage in SARS-CoV-2 in mammals indicate natural selection and host adaptation

Abstract

The outbreak of COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, rapidly grew into a global pandemic, and the virus has continued to spread across hosts from humans to animals, transmitting particularly effectively in mink. How SARS-CoV-2 evolves in animals and humans, and how these separate evolutionary processes differ, remains unclear. We analyzed the composition and codon usage bias of SARS-CoV-2 in infected humans and animals. Compared with other animals, SARS-CoV-2 in mink had the most substitutions. Cytidine substitutions account for nearly 50% of the substitutions in SARS-CoV-2 in mink, but only 30% of those in other animals. The incidence of adenine transversion in SARS-CoV-2 in other animals is threefold higher than that in mink-CoV (the SARS-CoV-2 virus in mink). A synonymous codon usage analysis showed that SARS-CoV-2 is optimized to adapt to the animals in which it is currently reported, and that adaptability was lower in all animals than in humans, except for mink. A binding affinity analysis indicated that the spike protein of the SARS-CoV-2 variant in mink showed a greater preference for binding with the mink receptor ACE2 than with the human receptor, especially as the mutations Y453F and F486L in mink-CoV improve binding affinity for the mink receptor. Our study focuses on the divergence of SARS-CoV-2 genome composition and codon usage in humans and animals, indicating possible natural selection and current host adaptation.
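
To make the synonymous codon usage analysis mentioned in the abstract more concrete, the following is a minimal sketch of how relative synonymous codon usage (RSCU) can be computed for a single coding sequence. It is not the authors' workflow (the resources table indicates they used MEGA software with host tables from the codon usage database). The function name `rscu` and the toy input are assumptions made for illustration. RSCU is taken here as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally.

```python
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper).

    Stop codons are ignored, and amino acids that never occur in the
    sequence are left out of the result.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    usable = len(cds) - len(cds) % 3     # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    result = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)   # equal usage of all synonyms
        for c in family:
            result[c] = counts[c] / expected
    return result


# Toy example: four alanine codons, three GCT and one GCC.
values = rscu("GCTGCTGCTGCC")
print(values["GCT"])  # 3.0
```

An RSCU above 1 marks a codon used more often than expected under equal synonymous usage, and a value below 1 marks one used less often. Comparing such values between a viral genome and a host codon usage table is one common way to gauge codon adaptation.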

Article activity feed

  1. SciScore for 10.1101/2021.04.04.438417:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Identification of mutations: The sequences were aligned using MEGA-X, and the single nucleotide polymorphisms were analyzed using the SNiPlay pipeline by uploading the aligned FASTA-format file (https://sniplay.southgreen.fr/cgi-bin/analysis_v3.cgi) (Dereeper et al., 2011).
    SNiPlay
    suggested: None
    The Datamonkey adaptive evolution server (http://www.datamonkey.org) was used to identify sites where only some of the branches have undergone selective pressure.
    http://www.datamonkey.org
    suggested: (DataMonkey, RRID:SCR_010278)
    The codon usage data of different hosts were retrieved from the codon usage database (http://www.kazusa.or.jp/codon/), and the relative synonymous codon usages (RSCUs) were analyzed using MEGA software.
    MEGA
    suggested: (Mega BLAST, RRID:SCR_011920)
    Comparisons of the predicted protein structures and pairwise comparisons were analyzed using PyMOL software.
    PyMOL
    suggested: (PyMOL, RRID:SCR_000305)
    Selective coefficient index: The selection coefficient index (S) of all SARS-CoV-2 codons was estimated by the FMutSel0 model in the program CODEML (PAML package) (Yang and Nielsen, 2008). The fitness parameter of the most common residue at each location is fixed to 0, while the other fitness parameters are limited to −20 < F < 20.
    PAML
    suggested: (PAML, RRID:SCR_014932)
    The figures were plotted using GraphPad Prism 5.0.
    GraphPad
    suggested: (GraphPad Prism, RRID:SCR_002798)
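
The mutation-identification step quoted in the table above relies on MEGA-X and SNiPlay. As a rough illustration of the kind of tally involved (substitutions per reference base, split into transitions and transversions), here is a minimal sketch that compares two pre-aligned sequences. It is not a reimplementation of either tool. The function names and the toy sequences are hypothetical.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(ref_base, alt_base):
    """Label a substitution as a transition or a transversion."""
    pair = {ref_base, alt_base}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def summarize_substitutions(reference, variant):
    """Count substitutions between two aligned sequences of equal length.

    Returns (by_ref_base, by_kind): counts keyed by the reference base,
    and counts keyed by (reference base, transition/transversion).
    Gaps and ambiguous bases are skipped.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    reference = reference.upper().replace("U", "T")
    variant = variant.upper().replace("U", "T")

    by_ref_base = Counter()
    by_kind = Counter()
    for ref, alt in zip(reference, variant):
        if ref == alt or ref not in "ACGT" or alt not in "ACGT":
            continue
        by_ref_base[ref] += 1
        by_kind[(ref, classify(ref, alt))] += 1
    return by_ref_base, by_kind


# Toy aligned pair: one C>T transition and one T>A transversion.
by_base, by_kind = summarize_substitutions("ACGTAC", "ATGAAC")
total = sum(by_base.values())
for base, n in sorted(by_base.items()):
    print(base, n, f"{100 * n / total:.0f}%")
```

Dividing each per-base count by the total gives proportions comparable in spirit to the cytidine and adenine figures reported in the abstract, although a real analysis would work from a multiple alignment of many genomes against a reference.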

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on page 28. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.