The effects of host phylogenetic coverage and congruence metric on Monte Carlo-based null models of phylosymbiosis

Abstract

Variation in host-associated microbial communities often parallels patterns of phylogenetic divergence between hosts, a pattern known as phylosymbiosis. Understanding of this phenomenon relies initially on quantifying phylosymbiotic signals across a broad range of host taxa. Quantifying signals of phylosymbiosis is typically achieved by calculating how congruent a host’s phylogenetic tree is with a dendrogram that represents patterns of dissimilarity in their associated microbial communities. To statistically assess the degree of congruence, several studies have constructed null models using a Monte Carlo approach to randomly sample trees. Although this approach is becoming more common, it has several features that warrant benchmarking to advise its further use. This approach relies on quantification of congruence between a host’s phylogenetic tree and its microbial community dendrogram. Therefore, it is important to establish how choice of congruence metric influences null model-based inferences. Furthermore, phylosymbiotic signals may manifest at different scales of host divergence, and it is important to establish the extent of host phylogenetic breadth needed to reliably detect a phylosymbiotic signal. To help improve our study of phylosymbiosis, here I examine how power and type I error (false positive) rates associated with this approach vary with choice of congruence metric and host phylogenetic coverage. Furthermore, I examine variation in sensitivity given uncertainty in tree estimation, as well as how well null congruence models align with expectations of community assembly that is completely neutral with respect to host phylogeny. I generally found that model performance increased rapidly with increasing tree sizes, suggesting lower limits on the host phylogenetic breadth needed to reliably detect phylosymbiotic signals with this approach. Furthermore, I found several notable variations in performance between congruence metrics, which translated into different inferences regarding signal detection. Overall, these findings suggest that Monte Carlo sampling across tree space can be an effective way to quantify phylosymbiotic signals and highlight key considerations for its implementation.

Article activity feed

  1. For anyone seeking to understand how, "from (a) simple beginning, endless forms most beautiful and most wonderful have been, and are being, evolved" [Darwin, 1859], the relatively simple rules of template-based DNA synthesis, mutation and vertical inheritance offer powerful lenses into the past, provided selection and drift are taken into account. These principles allow us to read genomic similarity as a record of shared ancestry, giving rise to phylogenetics and its tree-like reconstructions of evolutionary history, commonly referred to as phylogenetic trees. 

    One striking observation when zeroing in on specific lineages is that the communities of microorganisms that live in and on their members are more similar in composition to one another than to the microbial communities of other hosts sharing the same environments or the same diets (the two factors more obviously linked to the microbial communities). Relationships between community compositions in these cases also generate tree-like branching patterns, more precisely called dendrograms, sometimes uncannily mirroring the phylogenetic relationships among the organisms that serve as their hosts. This sort of phylogenetic signal of microbial communities associated with related hosts is called phylosymbiosis [Brucker & Bordenstein, 2013]. While diverse ecological and evolutionary processes can generate mirrored host-microbiome 'trees' [Kohl, 2020], rigorous quantification is required to understand them. DuBose [2025] tackles this methodological challenge by evaluating how different tree-congruence metrics and null models perform when detecting phylosymbiotic signals.

    At least three approaches have been used to quantify phylosymbiosis [Lim & Bordenstein, 2020]. Matrix-correlation methods compare host phylogenetic distances with microbiome beta-diversity distances (quantified by Bray-Curtis, Jaccard or UniFrac methods), typically using Mantel [Groussin et al., 2017] or Procrustes tests [Qin et al., 2021]. Phylogenetic comparative methods, though less common in this context, evaluate phylogenetic signal through measures such as Blomberg's K and Pagel's lambda [Donohue et al., 2022] or process-based models like multivariate Brownian motion [Perez-Lamarque et al., 2023]. Topological congruence methods, the most widely used, assess whether a host phylogeny and a microbiome dendrogram are more topologically similar than expected by chance. But the "devil is in the details" regarding what constitutes "chance" and whether different topological congruence metrics agree on the strength of the signal. 

    DuBose offers a thorough study of both aspects. First, the author implements a clever neutral community assembly model (following reviewer suggestions) to compare against standard random tree sampling, validating the approach as a robust reference for testing null hypotheses. Second, the study evaluates a large range of congruence metrics available in the widely used TreeDist and phangorn packages. The reproducibility of this method was assessed by a PCI data editor and this recommender; users interested in testing the full range of metrics can easily modify the provided scripts.

    While one may miss alternative model trees representing more biologically grounded processes, and while most results align with theoretical expectations (e.g. signal is more evident in larger trees), random tree sampling remains a common yet under-benchmarked reference in the literature [Brooks et al., 2016; Trevelline et al., 2020; Graham et al., 2025]. DuBose provides the first systematic exploration of its properties, demonstrating that Monte Carlo sampling across tree space serves as a meaningful null framework. This work clearly outlines key caveats and implementation choices researchers must consider, making it a necessary methodological resource for the field.
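
    To make the random-tree null framework discussed above concrete, here is a minimal Python sketch of the general idea. It is not the author's code, which relies on the R packages named above. It assumes trees written as nested tuples, uses a rooted Robinson-Foulds distance as just one of many possible congruence metrics, and draws random trees by joining two randomly chosen lineages at a time, which is only one possible way to sample tree space and is not uniform over topologies. All function names are hypothetical.

```python
import random


def leaf_names(tree):
    """List the leaf labels of a nested-tuple tree (a leaf is any non-tuple)."""
    if not isinstance(tree, tuple):
        return [tree]
    return [name for child in tree for name in leaf_names(child)]


def clades(tree):
    """Collect the non-root clades of a tree as frozensets of leaf labels."""
    found = set()

    def visit(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        members = frozenset().union(*(visit(child) for child in node))
        found.add(members)
        return members

    found.discard(visit(tree))  # the root clade is shared by every tree
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clades(tree_a) ^ clades(tree_b))


def random_tree(names, rng):
    """Assumed sampler: repeatedly join two randomly chosen lineages."""
    nodes = list(names)
    while len(nodes) > 1:
        i, j = sorted(rng.sample(range(len(nodes)), 2))
        right = nodes.pop(j)  # pop the larger index first so i stays valid
        left = nodes.pop(i)
        nodes.append((left, right))
    return nodes[0]


def monte_carlo_p_value(host_tree, microbe_dendrogram, n_random=999, seed=0):
    """Share of random trees at least as congruent with the host tree as the
    observed dendrogram, with the usual +1 correction."""
    rng = random.Random(seed)
    names = leaf_names(host_tree)
    observed = rf_distance(host_tree, microbe_dendrogram)
    as_congruent = sum(
        rf_distance(host_tree, random_tree(names, rng)) <= observed
        for _ in range(n_random)
    )
    return observed, (as_congruent + 1) / (n_random + 1)


# Toy example with six hypothetical hosts, A to F.
host = ((("A", "B"), "C"), (("D", "E"), "F"))
dendrogram = ((("A", "B"), "C"), (("D", "F"), "E"))
observed, p_value = monte_carlo_p_value(host, dendrogram)
print(f"RF distance = {observed}, Monte Carlo p = {p_value:.3f}")
```

    In this sketch, a small p-value means that few random trees match the host phylogeny as closely as the observed dendrogram does. Replacing rf_distance with another metric, or random_tree with another sampler, changes the null distribution, which is the kind of implementation choice the study benchmarks.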

    References

    Andrew W. Brooks, Kevin D. Kohl, Robert M. Brucker, Edward J. van Opstal, Seth R. Bordenstein (2016) Phylosymbiosis: relationships and functional effects of microbial communities across host evolutionary history. PLoS Biol. 14(11):e2000225. https://doi.org/10.1371/journal.pbio.2000225

    Robert M. Brucker, Seth R. Bordenstein (2013) The hologenomic basis of speciation: gut bacteria cause hybrid lethality in the genus Nasonia. Science. 341(6146):667-669. https://doi.org/10.1126/science.1240659

    Charles Darwin (1859) On the origin of species by means of natural selection. John Murray, London.

    Mariah E. Donohue, Amanda K. Rowe, Eric Kowalewski, Zoe L. Hert, Carly E. Karrick, Lovasoa J. Randriamanandaza, Francois Zakamanana, Stela Nomenjanahary, Rostant Y. Andriamalala, Kathryn M. Everson, Audrey D. Law, Luke Moe, Patricia C. Wright, David W. Weisrock (2022) Significant effects of host dietary guild and phylogeny in wild lemur gut microbiomes. ISME Commun. 2(1):33. https://doi.org/10.1038/s43705-022-00115-6

    James G. DuBose (2025) The effects of host phylogenetic coverage and congruence metric on Monte Carlo-based null models of phylosymbiosis. bioRxiv, ver. 4, peer-reviewed and recommended by PCI Evolutionary Biology. https://doi.org/10.1101/2025.02.07.637028

    Natalie J. Graham, Nicola J. Day, Gancho Slavov, Peter Ritchie, Steve A. Wakelin (2025) Evidence of phylosymbiosis in the microbiome of conifer roots. Phytobiomes J. 9(4):541-557. https://doi.org/10.1094/PBIOMES-03-25-0022-R

    Mathieu Groussin, Florent Mazel, Jon G. Sanders, Chris S. Smillie, Sebastien Lavergne, Wilfried Thuiller, Eric J. Alm (2017) Unraveling the processes shaping mammalian gut microbiomes over evolutionary time. Nat Commun. 8:14319. https://doi.org/10.1038/ncomms14319

    Kevin D. Kohl (2020) Ecological and evolutionary mechanisms underlying patterns of phylosymbiosis in host-associated microbial communities. Philos Trans R Soc Lond B Biol Sci. 375(1798):20190251. https://doi.org/10.1098/rstb.2019.0251

    Shen J. Lim, Seth R. Bordenstein (2020) An introduction to phylosymbiosis. Proc Biol Sci. 287(1922):20192900. https://doi.org/10.1098/rspb.2019.2900

    Benoit Perez-Lamarque, Guilhem Sommeria-Klein, Lorena Duret, Helene Morlon (2023) Phylogenetic comparative approach reveals evolutionary conservatism, ancestral composition, and integration of vertebrate gut microbiota. Mol Biol Evol. 40(7):msad144. https://doi.org/10.1093/molbev/msad144

    Man Qin, Jing Chen, Shifen Xu, Liyun Jiang, Gexia Qiao (2021) Microbiota associated with Mollitrichosiphum aphids (Hemiptera: Aphididae: Greenideinae): diversity, host species specificity and phylosymbiosis. Environ Microbiol. 23(4):2184-2198. https://doi.org/10.1111/1462-2920.15391

    Brian K. Trevelline, Jahree Sosa, Barry K. Hartup, Kevin D. Kohl (2020) A bird's-eye view of phylosymbiosis: weak signatures of phylosymbiosis among all 15 species of cranes. Proc Biol Sci. 287(1923):20192988. https://doi.org/10.1098/rspb.2019.2988