Reconstructing the network of horizontal gene exchange in bacteria to differentiate direct and indirect transfers
Abstract
Horizontal gene transfer (HGT) plays a central role in bacterial evolution, yet its large-scale dynamics and underlying network structure remain poorly characterized. We present a theoretical framework that models HGT as a continuous stochastic process over a network of bacterial genera and analyze its genomic footprint via the length distribution of exact sequence matches shared across taxa, the match length distribution (MLD). We show that different evolutionary regimes imprint distinct statistical signatures on the MLD: single, episodic gene transfer events yield exponential distributions, while continuous, sustained HGT processes lead to power-law tails. The power-law exponent is analytically linked to the topology of the transfer network, distinguishing between intra-clade transfers and hub-mediated dissemination. Empirical MLDs derived from bacterial genomes recapitulate these predicted patterns. Moreover, we find that a model which assigns each genus a "transferability" parameter governing its pairwise HGT rates, and which includes a high-transferability hub, accurately reproduces the observed data. Our approach provides a general framework for inferring hidden structure in genomic horizontal transfer processes, enabling quantitative analysis of microbial evolution.
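The contrast between the two regimes can be illustrated with a minimal numerical sketch. It is a toy model built on assumptions that are not taken from the abstract: mutations fall on a transferred segment as a Poisson process with per-base divergence `theta`, so one transfer event contributes an expected match density of `theta**2 * exp(-theta * r)` at match length `r`, and "continuous" transfer is represented by averaging over divergences spread uniformly up to `theta_max`. The function names and parameter values are hypothetical. Under these assumptions the averaged distribution decays as a power law with exponent close to -3. That value belongs to this toy setup only; the abstract states that the exponent depends on the topology of the transfer network.

```python
import numpy as np

def mld_single_event(r, theta):
    """Toy expected match density at length r for one transfer event.

    Assumes Poisson mutations at per-base divergence theta, so gaps between
    mutations (exact matches) are exponentially distributed with rate theta.
    """
    return theta**2 * np.exp(-theta * r)

def mld_continuous(r, theta_max, n_steps=20000):
    """Average the single-event density over divergences uniform on (0, theta_max]."""
    thetas = np.linspace(theta_max / n_steps, theta_max, n_steps)
    return mld_single_event(r[:, None], thetas[None, :]).mean(axis=1)

# Match lengths from 100 to 1000 bp (hypothetical range for illustration).
r = np.logspace(2, 3, 30)

# Single event: log(density) is linear in r, i.e. an exponential distribution.
single = mld_single_event(r, theta=0.01)
semilog_slope_single = np.polyfit(r, np.log(single), 1)[0]

# Continuous transfer: log(density) is linear in log(r), i.e. a power-law tail.
continuous = mld_continuous(r, theta_max=1.0)
loglog_slope_continuous = np.polyfit(np.log(r), np.log(continuous), 1)[0]

print(f"single event, slope of log m(r) vs r:        {semilog_slope_single:.4f}")
print(f"continuous,   slope of log m(r) vs log r:    {loglog_slope_continuous:.2f}")
```

The first slope recovers the decay rate `-theta` of the exponential, while the second is the fitted power-law exponent. Replacing the uniform weighting over `theta` with another weighting changes the fitted exponent, which is the kind of dependence the framework exploits to infer network structure.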