Identification of evolutionary trajectories shared across human betacoronaviruses

This article has been reviewed by the following groups

Abstract

Comparing the evolution of distantly related viruses can provide insights into common adaptive processes related to shared ecological niches. Phylogenetic approaches, coupled with other molecular evolution tools, can help identify mutations informative about adaptation, whilst placing these mutations in the structural context of functional protein sites may help reveal their biological properties. Two zoonotic betacoronaviruses capable of sustained human-to-human transmission have caused pandemics in recent times (SARS-CoV-1 and SARS-CoV-2), whilst a third virus (MERS-CoV) is responsible for sporadic outbreaks linked to animal infections. Moreover, two other betacoronaviruses have circulated endemically in humans for decades (HKU1 and OC43). To search for evidence of adaptive convergence between established and emerging betacoronaviruses capable of sustained human-to-human transmission (HKU1, OC43, SARS-CoV-1 and SARS-CoV-2), we developed a methodological pipeline to classify shared non-synonymous mutations as putatively denoting homoplasy (repeated mutations that do not share direct common ancestry) or stepwise evolution (sequential mutations leading towards a novel genotype). In parallel, we look for evidence of positive selection and draw upon protein structure data to identify potential biological implications. We find 30 mutations, with four of these [codon sites 18121 (nsp14/residue 28), 21623 (spike/21), 21635 (spike/25) and 23948 (spike/796); SARS-CoV-2 genome numbering] displaying evolution under positive selection and proximity to functional protein regions. Our findings shed light on potential mechanisms underlying betacoronavirus adaptation to the human host and pinpoint common mutational pathways that may occur during establishment of human endemicity.
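
The distinction the abstract draws between homoplasy and stepwise evolution can be made concrete with a small sketch. The Python below is not the authors' pipeline; it is a toy illustration that assumes the tree is given as a child-to-parent map and that amino acid changes at one site have already been placed on branches. The names `ancestors`, `classify_site`, `events` and `parent` are all hypothetical.

```python
from itertools import combinations


def ancestors(node, parent):
    """Return the set of all ancestors of `node` in a child -> parent map."""
    found = set()
    while node in parent:
        node = parent[node]
        found.add(node)
    return found


def classify_site(events, parent):
    """Label one site as 'homoplasy', 'stepwise', 'both' or 'neither'.

    events: list of (branch_child, from_state, to_state) tuples, one per
            amino acid change placed on the branch leading to branch_child.
    parent: dict mapping each non-root node to its parent node.
    Assumption: these simplified rules stand in for the paper's definitions.
    """
    homoplasy = False
    stepwise = False
    for (n1, a1, b1), (n2, a2, b2) in combinations(events, 2):
        n1_above_n2 = n1 in ancestors(n2, parent)
        n2_above_n1 = n2 in ancestors(n1, parent)
        if not (n1_above_n2 or n2_above_n1):
            # Same derived state arising on unrelated branches.
            if b1 == b2:
                homoplasy = True
            continue
        # Nested branches: order the two changes from older to younger.
        older, younger = ((a1, b1), (a2, b2)) if n1_above_n2 else ((a2, b2), (a1, b1))
        # Sequential change A -> B then B -> C, excluding a plain reversion.
        if older[1] == younger[0] and younger[1] != older[0]:
            stepwise = True
    if homoplasy and stepwise:
        return "both"
    if homoplasy:
        return "homoplasy"
    if stepwise:
        return "stepwise"
    return "neither"


# Toy tree: root -> A -> A1, and root -> B.
toy_parent = {"A": "root", "B": "root", "A1": "A"}
print(classify_site([("A", "S", "T"), ("B", "S", "T")], toy_parent))
print(classify_site([("A", "S", "T"), ("A1", "T", "I")], toy_parent))
```

A real analysis would also need ancestral state reconstruction and a way to handle uncertainty in the tree, neither of which this sketch attempts.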

Article activity feed

  1. SciScore for 10.1101/2021.05.24.445313

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Maximum likelihood phylogenies for the individual and global alignments were estimated using RAxML v8 (Stamatakis 2015) under a general time reversible nucleotide substitution model with gamma-distributed among-site rate variation (GTR+G) and branch support assessed using 100 bootstrap replicates.
    RAxML
    suggested: (RAxML, RRID:SCR_006086)
    In parallel, conserved amino acid states within the alignment were identified and extracted using a profile-to-profile alignment comparison of global consensus in amino acid sequences generated under a 99% threshold, and re-aligned using MAFFT v7.471 (Katoh and Standley 2013) (
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    Mapping mutations onto betacoronavirus protein structures: To relate the positions of amino acid changes to regions of known protein function, the mutations identified in section 3 were mapped using PyMOL v2.4.0 (https://pymol.org/2/) onto the available protein structures listed in Table 2 and in the Data Availability section.
    PyMOL
    suggested: (PyMOL, RRID:SCR_000305)
    Recombinant sequences identified were removed using ClonalFrameML (Didelot and Wilson 2015), whilst non-recombinant fragments were verified using GARD (Kosakovsky Pond, et al. 2006).
    ClonalFrameML
    suggested: (Clonalframe, RRID:SCR_016060)
    For each site of interest, coded amino acid traits were mapped onto the nodes of the MCC tree by performing reconstructions of ancestral states under an asymmetric discrete trait evolution model (DTA) in BEAST v1.8.4 (Lemey, et al. 2009; Suchard, et al. 2018).
    BEAST
    suggested: (BEAST, RRID:SCR_010228)
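
The 99% consensus threshold quoted in the MAFFT entry above can be illustrated with a short sketch. This is a minimal Python example of a per-column consensus call, not the authors' implementation: the function name `consensus`, the `X` placeholder for columns below the threshold, and the choice to count gap characters like any other symbol are all assumptions, since the quoted sentence does not say how gaps or ties were handled.

```python
from collections import Counter


def consensus(aligned_seqs, threshold=0.99, ambiguous="X"):
    """Build a consensus string from equal-length aligned sequences.

    A column keeps its most common symbol only if that symbol reaches
    `threshold` as a fraction of all sequences; otherwise `ambiguous`
    is written. Gaps ('-') are counted like any other symbol (assumption).
    """
    if not aligned_seqs:
        return ""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same aligned length")
    result = []
    for column in zip(*aligned_seqs):
        residue, count = Counter(column).most_common(1)[0]
        result.append(residue if count / len(column) >= threshold else ambiguous)
    return "".join(result)


# Toy alignment: 98 identical sequences plus 2 that differ at the last site.
toy_alignment = ["MKV"] * 98 + ["MKL"] * 2
print(consensus(toy_alignment))                  # last column falls below 99%
print(consensus(toy_alignment, threshold=0.5))   # a looser threshold keeps 'V'
```

Sites that keep a residue under the strict threshold are the conserved states; everything else is masked rather than guessed.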

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Other limitations of our study include (i) the low availability of genomes sampled longitudinally through time (especially for HKU1 and SARS-CoV-1), and (ii) the low genetic variability for SARS-CoV-2 (Rausch, et al. 2020), which restrict the statistical power to detect mutations likely to denote adaptation (van Dorp, Richard, et al. 2020). Further, it is not possible to be certain that the mutations identified by our pipeline are indeed adaptive, as apparent homoplasy and stepwise evolution can also result from non-adaptive evolutionary processes such as genetic drift, mutational hitchhiking, and mutational rate biases (Delport, et al. 2008; Pond, et al. 2012; De Maio, et al. 2020; Simmonds 2020; Wang, et al. 2021). Further genomic surveillance of these viruses, as well as other betacoronaviruses that may potentially emerge, will be necessary to confirm that the mutational panel presented here may represent common pathways reflecting betacoronavirus adaptation to the human host. The mutations identified here may be informative about ongoing adaptation of betacoronaviruses circulating in the human population, but require further experimental evidence to interpret their adaptive effect and biological significance.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.