Bioinformatics of combined nuclear and mitochondrial phylogenomics to define key nodes for the classification of Coleoptera
Abstract
Nuclear genome sequencing is resource-intensive and not practical for building densely sampled phylogenetic trees of the most species-rich lineages of animals, while mitochondrial genomes can be sequenced and analysed with relative ease. Here, we develop a conceptual approach and bioinformatics workflow for combining nuclear single-copy orthologs with less informative but densely sampled mitochondrial genomes, to build a detailed tree of Coleoptera (beetles). First, basal relationships of Coleoptera were inferred from >2,000 BUSCO loci mined from GenBank’s Short Read Archive for 119 exemplars of all major lineages, under various substitution models and levels of matrix completion, to reveal universally supported nodes. Second, the corresponding mitogenomes were extracted and combined with those of an additional 373 species selected for broad taxonomic and biogeographic coverage, roughly in proportion to the known global species diversity of Coleoptera. Bioinformatic processing of mitogenomes was conducted with a novel pipeline for rapid, accurate annotation of protein-coding genes. Finally, phylogenetic trees from all 492 mitogenomes were generated under a backbone constraint from the universal basal nodes, which produced a well-supported tree of the major lineages at family and superfamily level. Being genetically unlinked from the nuclear genome and showing distinct character variation, mitogenomes provide an independent perspective on the phylogeny. Comparison with three recent nuclear phylogenomic studies resulted in the recognition of >80 nodes universally present across all analyses. These may now support the higher classification of Coleoptera and serve as a backbone for further studies, as numerous full mitogenomes and mitochondrial DNA barcodes are added to an increasingly complete phylogenetic tree of this super-diverse insect order.
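The idea of "universally supported nodes" can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes rooted trees over the same set of tips, given as Newick strings, and treats a node as universal when the same set of descendant tips forms a clade in every input tree. The function names `clades` and `universal_clades` and the tip labels A to E are hypothetical, chosen only for illustration.

```python
import re


def clades(newick):
    """Return every clade of a rooted Newick tree as a frozenset of tip labels.

    Assumption: the string contains at least one pair of parentheses.
    Branch lengths and internal node labels (e.g. support values) are ignored.
    """
    tokens = re.findall(r"[(),]|[^(),;]+", newick)
    stack = []      # one set of tip labels per currently open parenthesis
    found = set()
    prev = None
    for tok in tokens:
        if tok == "(":
            stack.append(set())
        elif tok == ")":
            done = stack.pop()
            found.add(frozenset(done))
            if stack:
                stack[-1] |= done   # tips also belong to the enclosing clade
        elif tok != ",":
            # A label right after ")" names an internal node, not a tip.
            name = tok.split(":")[0].strip()
            if prev != ")" and name:
                stack[-1].add(name)
        prev = tok
    return found


def universal_clades(trees):
    """Return the non-trivial clades that occur in every tree."""
    shared = set.intersection(*(clades(t) for t in trees))
    n_tips = max((len(c) for c in shared), default=0)
    # Drop the root clade (all tips) and any single-tip groups.
    return {c for c in shared if 1 < len(c) < n_tips}


if __name__ == "__main__":
    # Hypothetical toy trees; A-E stand in for lineages.
    trees = [
        "((A,B),(C,(D,E)));",
        "((A,B),((C,D),E));",
        "(((A,B),C),(D,E));",
    ]
    for clade in universal_clades(trees):
        print(sorted(clade))
```

In this toy input only the grouping of A with B is recovered by all three trees, so it is the only clade that would be carried forward, for example as a fixed node in a backbone constraint tree for a later, more densely sampled analysis.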