State Space Misspecification in Morphological Phylogenetics: A Pitfall for Models and Parsimony Alike


Abstract

Phylogenetic analysis relies on two fundamental levels of biological information: genotype and phenotype. Molecular data benefit from operating within a well-defined, finite state space (e.g., nucleotide alphabets), whereas morphological data present inherent challenges due to frequently ambiguous character states and variable state counts. In this study, I use simulated data to examine how state space misspecification (SSM), defined as the mismatch between the assumed and true state space, affects phylogenetic reconstruction. Results show that SSM generally reduces topological accuracy, with the extent of its impact depending on mutation rate, state space disparity, and the proportion of affected characters. Counterintuitively, under conditions typical of empirical morphological datasets (high proportions of binary characters and elevated mutation rates), SSM can improve topological precision. This creates a paradox where an incorrect model outperforms a correct one, though at the cost of distorted branch lengths. Importantly, the effects of SSM extend beyond model-based approaches. I demonstrate, through an extension of the no common mechanism (NCM) model, that standard maximum parsimony is consistent with the assumption that characters evolved under an SSM model—a largely overlooked feature. To address this, I propose a state-space-aware weighting scheme that accounts for variation in character state space. I also discuss additional strategies for mitigating SSM, including model adjustments and reducing reliance on oversimplified binary coding. This work underscores the need to explicitly address state space uncertainty in morphological phylogenetics. As morphology remains crucial for reconstructing deep-time lineages and integrating fossils, accounting for SSM is essential to improving the reliability of evolutionary trees.
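As a rough illustration of why a mismatch between the assumed and true state space can distort branch lengths, the sketch below uses a symmetric k-state Markov model of the Mk type. This model is an assumption made here for illustration; the abstract does not name the model used in the simulations. The function names are hypothetical, and the setup (comparing only the two end states of a single branch) is a deliberate simplification, not the author's simulation design.

```python
import math


def mk_change_probability(k: int, t: float) -> float:
    """Probability that the end state differs from the start state after a
    branch of length t (expected changes per character) under a symmetric
    k-state model with equal rates between all states (assumed Mk-type)."""
    if k < 2:
        raise ValueError("k must be at least 2")
    if t < 0:
        raise ValueError("t must be non-negative")
    return (k - 1) / k * (1.0 - math.exp(-k / (k - 1) * t))


def branch_length_from_change_probability(k: int, p: float) -> float:
    """Invert mk_change_probability: the branch length that an assumed
    k-state model assigns to an observed proportion p of changed characters."""
    saturation = (k - 1) / k  # limit of the change probability as t grows
    if not 0.0 <= p < saturation:
        raise ValueError("p must lie in [0, (k-1)/k)")
    return -(k - 1) / k * math.log(1.0 - k / (k - 1) * p)


if __name__ == "__main__":
    true_k, true_t = 4, 0.5  # hypothetical values chosen for illustration
    p = mk_change_probability(true_k, true_t)
    for assumed_k in (2, 3, 4, 6):
        try:
            est = branch_length_from_change_probability(assumed_k, p)
            print(f"assumed k={assumed_k}: estimated branch length {est:.3f}")
        except ValueError:
            print(f"assumed k={assumed_k}: proportion exceeds saturation")
```

When the assumed number of states matches the true one, the inversion recovers the original branch length; assuming fewer states than the true space inflates the estimate, and assuming more deflates it. This only mirrors the branch-length distortion mentioned in the abstract in the simplest possible setting and says nothing about topological accuracy.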
