Worth the fuss? Maximising informativeness for target capture-based phylogenomics in Erica (Ericaceae)
Abstract
Plant phylogenetics has been revolutionised in the genomic era, with target capture acting as the primary workhorse of most recent research in the new field of phylogenomics. Target capture (aka Hyb-Seq) allows researchers to sequence hundreds to thousands of genomic regions (loci) of their choosing, at relatively low cost per sample, from which to derive phylogenetically informative data. Although this highly flexible and widely applicable method has rightly earned its place as the field’s de facto standard, it is not without its challenges. In particular, users have to specify which loci to sequence. This is a surprisingly difficult task, especially when working with non-model groups, as it requires pre-existing genomic resources in the form of assembled genomes and/or transcriptomes. In the absence of taxon-specific genomic resources, researchers can use existing target sets that are designed to work across broad taxonomic scales. However, the highly conserved loci that these sets target may lack informativeness for difficult phylogenetic problems, such as that presented by the rapid radiation of Erica in southern Africa. In such cases, a “made to measure” approach may prove more fruitful. We set out to design a target set capable of resolving the Erica phylogeny while maximising informativeness, minimising paralogy, and allowing for comparability with other target sets. The result was a target set comprising just over 300 genes with excellent recovery rates across roughly 90 Erica species as well as outgroups from the genera Calluna, Daboecia, and Rhododendron. The targets had high information content as measured by parsimony-informative sites and Quartet Internode Resolution Probability (QIRP) at shallow nodes. Target recovery was not negatively affected by the inclusion of introns, and phylogenetic informativeness was positively correlated with intron content.
Notably, we show that specifically targeting introns, as opposed to recovering them via exon-flanking “bycatch”, can substantially improve the utility of a target set. Overall, our results show the value of investing in building a made to measure target set, and we provide a suite of open-source tools that can be used to replicate our approach in other groups (github.com/SethMusker/TargetVet).