The effects of host phylogenetic coverage and congruence metric on Monte Carlo-based null models of phylosymbiosis

James G. DuBose

Listed in

Evaluated articles (Peer Community in Evolutionary Biology)

Abstract

Variation in host-associated microbial communities often parallels patterns of phylogenetic divergence between hosts, a pattern known as phylosymbiosis. Understanding this phenomenon relies initially on quantifying phylosymbiotic signals across a broad range of host taxa. Quantifying signals of phylosymbiosis is typically achieved by calculating how congruent the host phylogenetic tree is with a dendrogram that represents patterns of dissimilarity among the hosts’ associated microbial communities. To statistically assess the degree of congruence, several studies have constructed null models using a Monte Carlo approach to randomly sample trees. Although this approach is becoming more common, it has several features that warrant benchmarking to advise its further use. This approach relies on quantifying congruence between the host phylogenetic tree and the microbial community dendrogram. Therefore, it is important to establish how the choice of congruence metric influences null model-based inferences. Furthermore, phylosymbiotic signals may manifest at different scales of host divergence, and it is important to establish the extent of host phylogenetic breadth needed to reliably detect a phylosymbiotic signal. To help improve the study of phylosymbiosis, here I examine how the power and type 1 error (false positive) rates associated with this approach vary with choice of congruence metric and host phylogenetic coverage. Furthermore, I examine variation in sensitivity given uncertainty in tree estimation, as well as how well null congruence models align with expectations of community assembly that is completely neutral with respect to host phylogeny. I generally found that model performance increased rapidly with increasing tree size, suggesting lower limits on the host phylogenetic breadth needed to reliably detect phylosymbiotic signals with this approach. Furthermore, I found several notable variations in performance between congruence metrics, which translated into different inferences regarding signal detection. Overall, these findings suggest that Monte Carlo sampling across tree space can be an effective way to quantify phylosymbiotic signals and highlight key considerations for its implementation.
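The general shape of a Monte Carlo congruence test can be illustrated with a minimal sketch. The abstract does not specify the metrics or the tree-sampling scheme used in the article, so everything here is an assumption made for illustration: trees are rooted and written as nested tuples, congruence is measured with a clade-based (rooted) Robinson-Foulds distance, random trees are built by repeatedly joining two randomly chosen lineages, and the p-value uses the common "+1" Monte Carlo correction. The function names (`clades`, `rf_distance`, `random_tree`, `monte_carlo_p`) are hypothetical.

```python
import random


def leaf_set(tree):
    """Return the leaf labels of a rooted tree given as nested tuples."""
    if not isinstance(tree, tuple):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clades(tree):
    """Return all non-trivial clades (as frozensets of leaf labels)."""
    found = set()

    def visit(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        members = frozenset().union(*(visit(child) for child in node))
        found.add(members)
        return members

    root = visit(tree)
    found.discard(root)  # the root clade is shared by every tree on these leaves
    return found


def rf_distance(tree_a, tree_b):
    """Clade-based Robinson-Foulds distance: clades present in only one tree."""
    return len(clades(tree_a) ^ clades(tree_b))


def random_tree(labels, rng):
    """Build a random rooted binary tree by joining two random lineages at a time."""
    nodes = list(labels)
    while len(nodes) > 1:
        i, j = rng.sample(range(len(nodes)), 2)
        joined = (nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
        nodes.append(joined)
    return nodes[0]


def monte_carlo_p(host_tree, dendrogram, n_random=999, seed=0):
    """Fraction of random trees at least as congruent with the host tree
    as the observed dendrogram (smaller distance = more congruent)."""
    rng = random.Random(seed)
    labels = sorted(leaf_set(host_tree))
    observed = rf_distance(host_tree, dendrogram)
    as_close = sum(
        rf_distance(host_tree, random_tree(labels, rng)) <= observed
        for _ in range(n_random)
    )
    return (1 + as_close) / (1 + n_random)


# Toy example: eight hypothetical hosts and a dendrogram that differs in one clade.
host = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
dendro = ((("a", "b"), ("c", "d")), (("e", "g"), ("f", "h")))
print(rf_distance(host, dendro), monte_carlo_p(host, dendro))
```

Swapping `rf_distance` for another tree-comparison function is the kind of change whose effect on inference the abstract describes; with very few hosts, the set of possible distances is small, which is one intuitive reason detection becomes unreliable on small trees.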

Version published to 10.24072/pcjournal.667
Dec 15, 2025
Peer Community in Evolutionary Biology
Dec 12, 2025

Version published to 10.1101/2025.02.07.637028 on bioRxiv
Feb 8, 2025

