Beyond Level-1: Fast Inference of Generic Semi-directed Phylogenetic Networks
Abstract
Phylogenetic networks capture reticulate evolution, but existing methods have mostly been restricted to level-1 topologies. This restriction severely limits the biological applicability of phylogenetic network inference. Here, we extend the widely used SNaQ method to scalably infer arbitrary binary, metric, semi-directed phylogenetic networks while allowing optional restriction to a user-specified network space. We implement computational improvements that yield substantial speedups in composite-likelihood evaluation, opening the door to genome-scale studies of hybridization, introgression, and horizontal gene transfer under a composite-likelihood framework for the first time. Guided by recent identifiability results, we restrict SNaQ’s search space to tree-child and galled networks (TCG) and assess SNaQ’s ability to accurately infer networks that fall both inside and outside of this space. In these simulations, SNaQ reliably recovers TCG networks under diverse conditions, and still recovers meaningful information about hybridizations even when the phylogeny is not correctly inferred. Finally, we analyze the phylogeny of Xiphophorus (Poeciliidae) and recover network models that fit the data significantly better than previously inferred level-1 networks, revealing a history with more hybridization events than those networks depict. By enabling scalable inference beyond level-1 networks, our work facilitates the reconstruction of far richer reticulate histories from genomic data, bringing phylogenetic analysis closer to capturing the full network of life.