Squirrel: Reconstructing semi-directed phylogenetic level-1 networks from four-leaved networks or sequence alignments
Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
Phylogenetic networks model the evolutionary history of taxa while allowing for reticulate events such as hybridization and horizontal gene transfer. As is the case for phylogenetic trees, it is often not possible to infer the root location of such a network directly from biological data for several evolutionary models. Hence, we consider semi-directed (phylogenetic) networks: partially directed graphs without a root in which the directed edges represent reticulate evolutionary events. By specifying a known outgroup, the rooted topology can be recovered from such networks. We introduce the algorithm S quirrel ( S emi-directed Quarnet-based Inference to Reconstruct Level-1 Networks) which constructs a semi-directed level-1 network from a full set of quarnets (four-leaf semi-directed networks). Our method also includes a heuristic to construct such a quarnet set directly from sequence alignments. To build a network from quarnets, S quirrel first builds a tree, after which it repeatedly solves the T ravelling S alesman P roblem (TSP) to replace each high-degree vertex by a cycle. We demonstrate S quirrel ’s performance on randomly generated networks and on real sequence data sets, the largest of which contains 29 aligned sequences close to 1.7 Mpb long. The resulting networks are obtained on a standard laptop within a few minutes. Lastly, we prove that S quirrel is combinatorially consistent: given a full set of quarnets coming from a triangle-free semi-directed level-1 network, it is guaranteed to reconstruct the original network. S quirrel is implemented in Python, has an easy-to-use graphical user-interface that takes sequence alignments or quarnets as input, and is freely available at https://github.com/nholtgrefe/squirrel .