Identifiability of Phylogenetic Level-2 Networks under the Jukes-Cantor Model
Abstract
We investigate which evolutionary histories can potentially be reconstructed from sufficiently long DNA sequences by studying the identifiability of phylogenetic networks from sequence data generated under site-independent models of molecular evolution. While previous work in the field has established the identifiability of phylogenetic trees and of level-1 networks (networks with non-overlapping reticulation cycles), less is known about more complex network structures. In this work, we extend identifiability results to network classes that include pairs of tangled reticulations.
Our main result shows that binary semi-directed level-2 phylogenetic networks are generically identifiable under the Jukes-Cantor model, provided they are triangle-free and strongly tree-child. We also strengthen existing identifiability results for level-1 networks, showing that the number of reticulation nodes is generically identifiable under the Jukes-Cantor model.
In addition, we present more general identifiability results that place no restriction on the network level and hold for both the Jukes-Cantor and the Kimura 2-parameter models. Specifically, we demonstrate that any two binary semi-directed networks that display different sets of 4-leaf subtrees (quartets) are distinguishable. This has direct implications for the identifiability of a network’s reticulated components (blobs). We show that the tree-of-blobs of a network (its global branching structure) is identifiable, as well as the circular ordering of the subnetworks around each blob, for networks in which edges do not cross and taxa are on the outside.