Likelihood-based evaluation of character recoding schemes for phylogenetic analysis

Tae-Kun Seo
Jeffrey L. Thorne


Abstract

Character recoding is a common practice in evolutionary studies. For example, phylogenies can be inferred from protein-coding DNA sequences via 61-state codon substitution models. However, often the inference is instead done by adopting 20-state amino acid replacement models that do not explicitly consider synonymous substitution. When there is substantial heterogeneity of amino acid frequencies among sites and/or among lineages, another sort of character recoding is sometimes performed. In these cases, one option is to reduce the state space of the model by placing each of the 20 amino acids into one of a relatively small number (e.g., 6) of groups of amino acids and then modeling only how group membership changes at a site over evolutionary time. Unfortunately, these kinds of character recoding schemes are prone to reducing the amount of available evolutionary information. Here, we provide a likelihood framework to statistically assess recoding schemes. Although we concentrate on the recoding of 61-state codon substitution models into 20-state amino acid replacement models, the general approach is also relevant to other recoding schemes such as those that recode 20-state models into 6-state models.
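As a concrete illustration of the 20-state to 6-state recoding mentioned above, the following minimal Python sketch maps amino acids to group labels and counts how many distinct alignment columns survive the recoding. The grouping used here is the widely used Dayhoff-style six-category scheme, which is an assumption for illustration; the abstract does not say which 6-state grouping the authors consider. The names `DAYHOFF6_GROUPS`, `recode`, and `count_site_patterns`, and the toy alignment, are hypothetical.

```python
# Illustrative 6-state recoding of amino acid sequences.
# ASSUMPTION: Dayhoff-style groups; the preprint's abstract does not name a scheme.
DAYHOFF6_GROUPS = {
    "0": "AGPST",
    "1": "DENQ",
    "2": "HKR",
    "3": "ILMV",
    "4": "FWY",
    "5": "C",
}

# Invert the grouping: amino acid letter -> group label.
AA_TO_GROUP = {
    aa: label for label, members in DAYHOFF6_GROUPS.items() for aa in members
}


def recode(seq):
    """Replace each amino acid with its group label.

    Symbols outside the 20 standard amino acids (gaps, X, ...) become '-'.
    """
    return "".join(AA_TO_GROUP.get(aa, "-") for aa in seq.upper())


def count_site_patterns(alignment):
    """Count the distinct columns in an alignment of equal-length sequences."""
    return len(set(zip(*alignment)))


# Toy alignment (made up for illustration, not data from the preprint).
alignment = ["MKVLA", "MRILS", "LKVIT"]
recoded = [recode(s) for s in alignment]

print(recoded)
print(count_site_patterns(alignment), count_site_patterns(recoded))
```

In this toy case the three distinct amino acid sequences collapse to the same recoded sequence, which is the kind of information loss the abstract refers to. The sketch only shows the many-to-one mapping; it does not implement the likelihood framework itself.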

Version published to 10.1101/2025.10.15.682473 on bioRxiv
Oct 16, 2025
