The effects of host phylogenetic coverage and congruence metric on Monte Carlo-based null models of phylosymbiosis
Listed in: Evaluated articles (Peer Community in Evolutionary Biology)
Abstract
Variation in host-associated microbial communities often parallels patterns of phylogenetic divergence between hosts, a pattern known as phylosymbiosis. Understanding this phenomenon relies initially on quantifying phylosymbiotic signals across a broad range of host taxa. Signals of phylosymbiosis are typically quantified by calculating how congruent the host phylogenetic tree is with a dendrogram that represents patterns of dissimilarity among the hosts’ associated microbial communities. To statistically assess the degree of congruence, several studies have constructed null models using a Monte Carlo approach to randomly sample trees. Although this approach is becoming more common, it has several features that warrant benchmarking to guide its further use. First, it relies on quantifying congruence between the host phylogenetic tree and the microbial community dendrogram, so it is important to establish how the choice of congruence metric influences null model-based inferences. Second, phylosymbiotic signals may manifest at different scales of host divergence, so it is important to establish the extent of host phylogenetic breadth needed to reliably detect a phylosymbiotic signal. To help improve the study of phylosymbiosis, here I examine how power and type 1 error (false positive) rates associated with this approach vary with choice of congruence metric and host phylogenetic coverage. I also examine variation in sensitivity given uncertainty in tree estimation, as well as how well null congruence models align with expectations of community assembly that is completely neutral with respect to host phylogeny. I generally found that model performance increased rapidly with increasing tree size, suggesting lower limits on the host phylogenetic breadth needed to reliably detect phylosymbiotic signals with this approach. I also found several notable differences in performance between congruence metrics, which translated into different inferences regarding signal detection. Overall, these findings suggest that Monte Carlo sampling across tree space can be an effective way to quantify phylosymbiotic signals, and they highlight key considerations for its implementation.
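
The general logic of a Monte Carlo null model of congruence can be illustrated with a minimal Python sketch. It is not the article's implementation, and several choices here are assumptions made purely for illustration: trees are written as nested tuples, congruence is measured with a rooted Robinson-Foulds-style distance (only one of several possible metrics), random trees are built by repeatedly joining two randomly chosen lineages, and the function names (`clades`, `rf_distance`, `random_tree`, `monte_carlo_p`) are hypothetical.

```python
import random


def clades(tree):
    """Return the non-trivial clades (frozensets of leaf names) of a rooted
    tree written as nested tuples; the clade holding every leaf is excluded."""
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    found.discard(leaves(tree))
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds-style distance: the number of clades present in
    exactly one of the two trees (0 means identical topologies)."""
    return len(clades(tree_a) ^ clades(tree_b))


def random_tree(leaf_names, rng):
    """Assumed sampling scheme: join two randomly chosen lineages until a
    single rooted binary tree remains."""
    nodes = list(leaf_names)
    while len(nodes) > 1:
        i, j = rng.sample(range(len(nodes)), 2)
        joined = (nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
        nodes.append(joined)
    return nodes[0]


def monte_carlo_p(host_tree, microbe_tree, n_random=2000, seed=0):
    """Compare the observed host-microbe distance with distances between the
    host tree and randomly sampled trees on the same set of tips."""
    rng = random.Random(seed)
    names = sorted(set().union(*clades(host_tree)))  # all host tip labels
    observed = rf_distance(host_tree, microbe_tree)
    # Count random trees that are at least as congruent as the observed one.
    hits = sum(
        rf_distance(host_tree, random_tree(names, rng)) <= observed
        for _ in range(n_random)
    )
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_random + 1)


# Toy data (hypothetical): seven hosts and a perfectly congruent dendrogram.
host = ((("A", "B"), "C"), (("D", "E"), ("F", "G")))
microbiome = ((("A", "B"), "C"), (("D", "E"), ("F", "G")))
observed, p_value = monte_carlo_p(host, microbiome)
```

A small `p_value` indicates that few random trees match the host phylogeny as closely as the microbial dendrogram does. Swapping `rf_distance` for another metric, or changing the number of tips, is the kind of variation whose effect on power and type 1 error the abstract describes benchmarking.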
