What is the best method for estimating ancestral states from discrete characters?
Abstract
Ancestral state estimation is a formal phylogenetic method for inferring the nature of ancestors and performing tests of character evolution. As such, it is among the most important tools available to evolutionary biologists. However, there is a profusion of methods available, and their relative accuracy remains unclear. Here I use a simulation approach to compare parsimony and likelihood methods for estimating ancestral states from discrete binary characters. I simulate 500 characters using 15 different Markov generating models, a range of tree sizes (8-256 tips) and three topologies representing end members of tree symmetry and branch length heterogeneity. Simulated tip states were subjected to ancestral state estimation under the Equal Rates (ER) and All-Rates-Different (ARD) models, as well as under parsimony assuming accelerated transformations (ACCTRAN). The results demonstrate that both parsimony and likelihood approaches obtain high accuracy when applied to trees with more tips. Parsimony performs poorly when trees contain long branches, whereas the ER model performs well across simulations and is reasonably robust to model violation. The ER model frequently outperforms the ARD model, even when data are simulated using unequal rates. Furthermore, the ER model exhibits less transition rate error than the ARD model. These results suggest that ARD models may be overparameterized when character data are limited. Surprisingly, the difference in likelihood-based information criteria between models was found to be a poor predictor of the difference in model error; better fitting models are not necessarily more accurate. However, there is a strong correlation between model uncertainty and model error; likelihood models with more certain ancestral state estimates are typically more accurate. Using empirical morphological datasets, I demonstrate that applying different methods often results in substantively different ancestral state estimates.
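To make the simulation setup concrete, the following is a minimal sketch of how a binary character can be evolved down a tree under a two-state Markov model. It is not the code used in the study: the tree, the rate values, and the names `transition_probs` and `simulate` are illustrative assumptions. The ER model corresponds to equal forward and backward rates (`q01 == q10`), whereas the ARD model lets the two rates differ.

```python
import math
import random


def transition_probs(q01, q10, t):
    """Return the 2x2 matrix P(t) for a two-state Markov model.

    q01 is the 0 -> 1 rate and q10 is the 1 -> 0 rate; entry [i][j]
    is the probability of ending in state j after time t, starting in i.
    """
    total = q01 + q10
    decay = math.exp(-total * t)
    pi0, pi1 = q10 / total, q01 / total  # stationary frequencies
    return [
        [pi0 + pi1 * decay, pi1 * (1.0 - decay)],
        [pi0 * (1.0 - decay), pi1 + pi0 * decay],
    ]


def simulate(tree, state, q01, q10, rng, tips=None):
    """Evolve a binary character down `tree` and return {tip_name: state}.

    A tree is either a tip name (str) or a list of
    (subtree, branch_length) pairs; `state` is the state at its root.
    """
    if tips is None:
        tips = {}
    if isinstance(tree, str):
        tips[tree] = state
        return tips
    for child, length in tree:
        p = transition_probs(q01, q10, length)
        child_state = 1 if rng.random() < p[state][1] else 0
        simulate(child, child_state, q01, q10, rng, tips)
    return tips


# Hypothetical four-tip balanced tree with unit branch lengths.
tree = [
    ([("A", 1.0), ("B", 1.0)], 1.0),
    ([("C", 1.0), ("D", 1.0)], 1.0),
]
rng = random.Random(1)

# Assumed rates, chosen only for illustration.
er_tips = simulate(tree, 0, q01=0.1, q10=0.1, rng=rng)   # equal rates
ard_tips = simulate(tree, 0, q01=0.1, q10=0.5, rng=rng)  # unequal rates
print(er_tips)
print(ard_tips)
```

Tip states generated in this way can then be passed to any estimation method, and the estimated node states compared against the simulated ones to score accuracy.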
The results of the simulation study highlight the importance of incorporating fossils in ancestral state estimation. Fossils increase the total number of tips, break long branches and are closer to internal nodes, thereby lowering average branch length and overall branch length heterogeneity of trees. These factors will all contribute to increasing the accuracy of ancestral state estimates, irrespective of the method used.
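The effect of short branches on certainty can be illustrated with a toy calculation. The sketch below assumes an ER model, a flat prior on the root state, and a hypothetical rate and tree; the function names are illustrative and not taken from the study. It computes the marginal probability of state 1 at the root with Felsenstein's pruning algorithm, first from two long-branched tips alone and then with an additional tip on a short branch standing in for a fossil.

```python
import math


def er_probs(q, t):
    """Return (P(same state), P(different state)) after time t under ER."""
    decay = math.exp(-2.0 * q * t)
    return 0.5 + 0.5 * decay, 0.5 - 0.5 * decay


def partial_likelihoods(node, q):
    """Likelihood of the tip data below `node`, for node state 0 and 1.

    A node is either an observed tip state (0 or 1) or a list of
    (child, branch_length) pairs.
    """
    if isinstance(node, int):
        return [1.0 if node == s else 0.0 for s in (0, 1)]
    like = [1.0, 1.0]
    for child, length in node:
        child_like = partial_likelihoods(child, q)
        same, diff = er_probs(q, length)
        for s in (0, 1):
            like[s] *= same * child_like[s] + diff * child_like[1 - s]
    return like


def root_posterior(node, q):
    """Marginal probability of state 1 at the root, assuming a flat prior."""
    l0, l1 = partial_likelihoods(node, q)
    return l1 / (l0 + l1)


q = 0.5  # hypothetical transition rate

# Two extant tips in state 1, each on a long branch from the root.
extant_only = [(1, 2.0), (1, 2.0)]
# Same tree plus a tip in state 1 on a short branch (a stand-in for a fossil).
with_fossil = [(1, 2.0), (1, 2.0), (1, 0.1)]

print(root_posterior(extant_only, q))
print(root_posterior(with_fossil, q))
```

Under these assumptions the short-branched tip raises the marginal probability of state 1 at the root. This shows a gain in certainty rather than accuracy as such, and it is only a toy case, not a reproduction of the simulations.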