IQ-NET: A Deep Learning Approach for Fast and Accurate Phylogenetic Inference from Real Alignments
Abstract
Phylogenetic inference is fundamental to modern biology, with applications spanning evolutionary biology, epidemiology, and comparative genomics. While maximum likelihood and Bayesian methods remain the gold standard due to their statistical rigor, they rely on simplifying evolutionary assumptions and are computationally intensive. Existing machine learning approaches offer speed advantages, but face several limitations: exclusive reliance on simulated training data, inadequate handling of gaps, focus primarily on topology rather than complete tree reconstruction, and sensitivity to input sequence order. Here, we introduce IQ-NET (Intelligent Quartet NETwork), a machine learning framework that addresses these limitations through training exclusively on real datasets, simultaneous inference of topology and branch lengths from gapped alignments without substitution model assumptions, and robustness to the order of input sequences. IQ-NET outperforms existing machine learning methods and achieves both higher accuracy and a 24-fold speedup over the IQ-TREE software. We also demonstrate IQ-NET's utility in species tree reconstruction by integrating it with ASTRAL.