Horizontal Gene Transfer Inference: Gene presence-absence outperforms gene trees
Abstract
Horizontal Gene Transfer (HGT) is a major driving force in prokaryotic evolution. A range of methods exists for HGT inference, but they are seldom systematically compared with each other. This is due to the validation problem associated with HGT inference: it is impossible to go back in time and observe the true HGT events that we infer today, especially since their traces have since been altered by other evolutionary processes. In the absence of such a ground truth, inference methods are often validated using simulated data. Simulations, however, may not accurately reflect reality. They can be biased by our current understanding of evolution and by simplifying assumptions, and the assumptions made in the inference model may also be built into the simulation. Here, we perform a large-scale comparison of HGT inference methods using a common dataset of bacterial genomic data. We focus on how well they infer clusters of neighboring co-acquired and co-transferred genes, as well as HGT experienced by genes associated with environmental response. Our analysis reveals that implicit phylogenetic methods, which depend on gene presence-absence profiles rather than gene trees, perform better at inferring meaningful HGT events. Our findings underscore the importance of selecting an appropriate method to infer HGT accurately. We offer practical recommendations to guide the choice of such methods and establish a comprehensive benchmark that can inform future development of HGT inference approaches.