SARS-CoV-2 convergent evolution cannot be reliably inferred from phylogenetic analyses
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
A homoplasy is a trait shared between individuals that did not arise in a common ancestor, but rather is the result of convergent evolution. SARS-CoV-2 homoplasic mutations are important to characterise, because the evidence for a mutation conferring a fitness advantage is strengthened if this mutation has evolved independently and repeatedly in separate viral lineages. Yet detecting homoplasy is difficult due to insufficient variation between sequences to construct reliable phylogenetic trees. Here, we develop a method to more robustly identify confident homoplasies. We derive a maximum likelihood (ML) tree, with taxa bearing seemingly recurrent mutations dispersed across the tree, and then, for each potentially homoplasic mutation, we derive an alternative tree where the same taxa are constrained to one clade such that the mutation is no longer homoplasic. We then compare how well the two trees fit the sequence data. Applying this method to SARS-CoV-2 yields only a few instances where the constrained trees have significantly less statistical support than the unconstrained tree, suggesting that phylogenetics can provide only limited support for homoplasy in SARS-CoV-2 and that caution is needed when inferring evidence of convergent evolution from phylogenetic methods in the absence of evidence from other sources.
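The constraint step in the abstract can be illustrated with a minimal Python sketch. It groups the taxa that carry a candidate mutation into one clade and all remaining taxa into another, using the two-clade form ((a1,a2,a3…),(b1,…)) quoted in the SciScore report on this page. The function name `constraint_newick` and the taxon labels are hypothetical, and the sketch assumes labels contain no Newick-special characters; it is not the authors' code.

```python
from typing import Iterable


def constraint_newick(with_mutation: Iterable[str],
                      without_mutation: Iterable[str]) -> str:
    """Build a two-clade constraint tree as a Newick string.

    Taxa carrying the candidate mutation form the first clade and all
    other taxa form the second, so that under this topology the mutation
    would need to arise only once.
    Assumption: taxon labels are Newick-safe (no commas, parentheses,
    colons, semicolons, or spaces); real sequence names may need quoting.
    """
    a = sorted(set(with_mutation))
    b = sorted(set(without_mutation))
    if not a or not b:
        raise ValueError("each group needs at least one taxon")
    overlap = set(a) & set(b)
    if overlap:
        raise ValueError(f"taxa assigned to both groups: {sorted(overlap)}")
    return "((" + ",".join(a) + "),(" + ",".join(b) + "));"


# Hypothetical taxon labels, for illustration only.
print(constraint_newick(["seq3", "seq1"], ["seq2", "seq4"]))
# prints ((seq1,seq3),(seq2,seq4));
```

Sorting the labels only makes the output deterministic; it does not affect the constraint itself.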
Article activity feed
SciScore for 10.1101/2021.05.15.444301
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Experimental Models: Organisms/Strains
- Sentence: For each identified homoplasy, we constructed a constraint tree in the form ((a1,a2,a3…),(b1,b1,b3…)), where (a1,a2,a3…) included the taxa downstream of any of the homoplasic changes that had the resulting amino acid; and (b1,b1,b3…) comprised taxa that were not downstream of any of these changes and had a different amino acid.
  Resource: b1 (suggested: None)
Software and Algorithms
- Sentence: Corresponding full-length human SARS-CoV-2 sequences were retrieved from the Global Initiative on Sharing All Influenza Database (GISAID) (https://www.gisaid.org/), which they align to the reference Wuhan sequence using MAFFT.
  Resource: MAFFT (suggested: MAFFT, RRID:SCR_011811)
- Sentence: The alignment was further adjusted by hand using AliView v1.26 (Larsson, 2014).
  Resource: AliView (suggested: AliView, RRID:SCR_002780)
- Sentence: Statistical analysis: The AU test (Shimodaira, 2002) as implemented in IQ-TREE was used to determine whether the constrained, alternative tree topology was a significantly worse fit to the data than the unconstrained ML tree topology, for each homoplasic site.
  Resource: IQ-TREE (suggested: IQ-TREE, RRID:SCR_017254)
- Sentence: Trees were viewed on ChromaClade (Monit et al., 2019) and FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/).
  Resource: FigTree (suggested: FigTree, RRID:SCR_008515)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
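The AU-test comparison quoted from the manuscript in Table 2 could be set up along the following lines. This Python sketch only assembles IQ-TREE argument lists; it does not run anything. The flags used (`-s`, `-g`, `-z`, `-n 0`, `-zb`, `-au`, `-pre`) are documented IQ-TREE command-line options, but the authors' exact invocation is not given here, so the helper names, file names, binary name and replicate count are all assumptions.

```python
from typing import List


def constrained_search_cmd(alignment: str, constraint_tree: str,
                           prefix: str, binary: str = "iqtree") -> List[str]:
    """Arguments for an ML tree search restricted by a constraint tree (-g)."""
    return [binary, "-s", alignment, "-g", constraint_tree, "-pre", prefix]


def au_test_cmd(alignment: str, candidate_trees: str, prefix: str,
                replicates: int = 10000, binary: str = "iqtree") -> List[str]:
    """Arguments for testing a set of candidate topologies.

    candidate_trees is a file holding the unconstrained and constrained
    trees in Newick format, one per line (-z). "-n 0" skips the tree
    search, -zb sets the number of RELL replicates, and -au adds the
    approximately unbiased test to the reported topology tests.
    """
    return [binary, "-s", alignment, "-z", candidate_trees,
            "-n", "0", "-zb", str(replicates), "-au", "-pre", prefix]


# Hypothetical file names, for illustration only.
print(" ".join(constrained_search_cmd("aln.fasta", "site_constraint.nwk",
                                      "site_constrained")))
print(" ".join(au_test_cmd("aln.fasta", "site_trees.nwk", "site_au")))
```

Returning argument lists rather than shell strings keeps the sketch easy to inspect and could be passed to `subprocess.run` per candidate site once the inputs exist.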
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.