Hydroplane I: one-shot probabilistic evolutionary analysis for scalable organizational identification

Abstract

Applying evolutionary genomics to microbial and viral community sequence information presents significant challenges. Metagenomic sequences are typically stored in large databases of short-read fragments with unknown relationships. Available tools for analysis are often slow, rely on incomplete subsets of data, or focus on narrowly defined sub-problems. Moreover, existing tools often depend on simplistic model assumptions or treat inferences as empirical data, which can distort downstream analyses. In this work, we present theory and initial validation of the first stage of a fast, one-shot Bayesian approach to metagenomic evolutionary analysis. Our primary result here demonstrates effective use of simple models in experimental design using collections of proximal short oligonucleotide sequences (kmers) to detect probable homologs, and validates the speed, sensitivity and specificity of this approach on a test set of whole Prochlorococcus genome sequences.
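The general idea of using collections of proximal kmers to flag probable homologs can be illustrated with a minimal sketch. This is not the Hydroplane algorithm or its Bayesian model, which the abstract does not specify. It swaps in a plain Jaccard similarity between the kmer sets of fixed-size windows, and the function names, the window and step sizes, the value of k, and the threshold are all hypothetical choices made for illustration.

```python
from typing import Iterator, List, Set, Tuple


def kmer_set(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings (kmers) of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for fixed-size windows along seq."""
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        yield start, seq[start:start + window]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidate_homologs(
    query: str,
    target: str,
    k: int = 8,          # hypothetical kmer length
    window: int = 60,    # hypothetical window size grouping proximal kmers
    step: int = 30,      # hypothetical window stride
    threshold: float = 0.5,  # hypothetical similarity cutoff
) -> List[Tuple[int, int, float]]:
    """Return (query_start, target_start, score) for window pairs whose
    kmer sets overlap at or above the threshold.

    Illustrative stand-in only: a Jaccard cutoff replaces the
    probabilistic scoring described in the abstract.
    """
    query_windows = [(s, kmer_set(w, k)) for s, w in windows(query, window, step)]
    hits = []
    for t_start, t_win in windows(target, window, step):
        t_kmers = kmer_set(t_win, k)
        for q_start, q_kmers in query_windows:
            score = jaccard(q_kmers, t_kmers)
            if score >= threshold:
                hits.append((q_start, t_start, score))
    return hits
```

Calling `candidate_homologs(query, target)` on two nucleotide strings returns the window pairs that share a large fraction of their kmers. Grouping kmers by window is what makes them "proximal" in this sketch: a match requires many kmers to co-occur within the same short region, not merely somewhere in each sequence.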

Significance Statement

The Hydroplane algorithm provides an efficient framework for analyzing co-evolutionary relationships among microbial and viral genomes using metagenomic data. Hydroplane guides the identification of homologous regions, estimates lineage diversity, and reveals evolutionary events such as mutation, divergence, selection, and horizontal gene transfer. Its computationally lightweight design supports scalable genome clustering and adaptation analyses, enabling the study of rare microbial and viral genomes across diverse ecological contexts. With an accessible implementation, Hydroplane broadens the scope of evolutionary genomics research, offering critical insights into host-pathogen dynamics and supporting biopreparedness through the study of microbial and viral genetic relationships.
