Reliable inference of phylogenomic relationship via assembly-based strategy accommodating raw reads and proteins

Yunlong Li
Xu Liu
Chong Chen
Jian-Wen Qiu
Kevin Kocot
Jin Sun

Abstract

Phylogenomics has emerged as a transformative approach in systematics, conservation biology, and biomedicine, enabling the inference of evolutionary relationships by leveraging hundreds to thousands of genes from genomic or transcriptomic data. However, acquiring high-quality genomes and transcriptomes necessitates samples with intact DNA and RNA, substantial sequencing investments, and extensive bioinformatic processing, such as genome/transcriptome assembly and annotation. This challenge is particularly pronounced for rare or difficult-to-collect species, such as those inhabiting the deep sea, where only fragmented DNA reads are often available due to environmental degradation or suboptimal preservation conditions. To address these limitations, we introduce VEHoP (Versatile, Easy-to-use Homology-based Phylogenomic pipeline), a tool designed to infer protein-coding regions from diverse inputs, including raw reads (short and long), draft genomes, transcriptomes, and annotated genomes. VEHoP automates the generation of orthologous sequence alignments, concatenated matrices, and phylogenetic trees, streamlining phylogenomic analyses for researchers across disciplines. The tool aims to (1) expand taxonomic sampling by accommodating a wide range of input data types and (2) simplify phylogenomic workflows, making them accessible to researchers with varying levels of bioinformatic expertise. We evaluated VEHoP’s performance using datasets from oysters, catfish, and insects, demonstrating its ability to produce robust phylogenetic trees with strong bootstrap support, outperforming assembly-free methods. Additionally, we applied VEHoP to reconstruct the phylogeny of the enigmatic deep-sea gastropod order Neomphalida, successfully resolving a well-supported phylogenetic backbone for this poorly understood group. VEHoP is freely available on GitHub (https://github.com/ylify/VEHoP), with dependencies easily installable via Bioconda.
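
For readers unfamiliar with the "concatenated matrix" step mentioned in the abstract, the following is a minimal Python sketch of how per-gene ortholog alignments are commonly joined into a single supermatrix, with gaps filling in taxa that lack a given gene. It is an illustration only and is not taken from VEHoP's source code; the function name `concatenate_alignments`, the input layout (a dictionary of gene name to taxon-to-sequence mappings), and the toy sequences are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical helper for illustration; not part of VEHoP.
def concatenate_alignments(
    alignments: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-gene alignments into one supermatrix.

    alignments maps gene name -> {taxon: aligned sequence}.
    Returns the supermatrix ({taxon: concatenated sequence}) and a list of
    partitions (gene, start, end) using 1-based inclusive coordinates.
    """
    # Every taxon seen in any gene gets a row in the supermatrix.
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    position = 0

    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa missing from this gene are padded with gap characters.
            rows[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, position + 1, position + length))
        position += length

    matrix = {taxon: "".join(parts) for taxon, parts in rows.items()}
    return matrix, partitions


if __name__ == "__main__":
    # Toy input: sp2 lacks geneB and sp3 lacks geneA.
    toy = {
        "geneA": {"sp1": "MK-L", "sp2": "MKAL"},
        "geneB": {"sp1": "GG", "sp3": "GA"},
    }
    matrix, partitions = concatenate_alignments(toy)
    for taxon, seq in matrix.items():
        print(f">{taxon}\n{seq}")
    print(partitions)
```

The partition list records where each gene sits in the concatenated alignment, which is the kind of information tree-inference programs typically accept when fitting a separate substitution model to each gene.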

Version published to 10.1101/2024.07.24.604968 on bioRxiv
Jul 24, 2024