Learning to Explore Tree Neighbourhoods for Phylogenetic Inference

Federico Julian Camerota Verdù
Andrea Gasparin
Luca Bortolussi
Lorenzo Castelli

Abstract

Background: Phylogenetic inference is a key challenge in computational biology, with applications ranging from evolutionary analysis to comparative genomics. The Balanced Minimum Evolution Problem (BMEP) offers a well-established formulation of this problem, but remains computationally intractable for large instances.

Results: In this work, we propose a reinforcement learning (RL) framework to tackle the BMEP through local search in the space of phylogenetic trees. Our contributions are threefold: (1) we introduce an improved RL formulation tailored to the structure of phylogenetic inference in the context of the BMEP; (2) we train an RL agent capable of solving instances with up to 100 taxa; and (3) we investigate the generalization capabilities of the learned policy across different substitution models, instance sizes, and datasets. To address the limitations of relying solely on the learned policy at inference time, we integrate it into a novel search-based framework that enables effective adaptation during evaluation.

Conclusions: Experimental results show that our method outperforms greedy heuristics and matches the performance of state-of-the-art algorithms for the BMEP. When tested under significant distributional shifts, we greatly reduce the gap with state-of-the-art algorithms. This demonstrates the potential of RL applications to phylogenetic inference.
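For readers unfamiliar with the BMEP, its objective is commonly written using Pauplin's formula: the length of an unrooted binary tree is the sum over taxon pairs of d_ij * 2^(1 - tau_ij), where d_ij is the input distance and tau_ij is the number of edges on the path between leaves i and j. The sketch below is illustrative only and is not taken from the preprint; the adjacency-dict tree encoding, the function names, and the four-taxon example are all assumptions made here.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Turn a list of undirected edges into an adjacency dict (hypothetical helper)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def leaf_path_lengths(adj, leaves):
    """Return {(i, j): edge count of the path between leaves i < j} via BFS."""
    tau = {}
    for src in leaves:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        for dst in leaves:
            if src < dst:
                tau[(src, dst)] = dist[dst]
    return tau


def bme_length(adj, leaves, d):
    """Pauplin's balanced tree length: sum over i < j of d[i][j] * 2^(1 - tau_ij)."""
    leaves = sorted(leaves)
    tau = leaf_path_lengths(adj, leaves)
    return sum(d[i][j] * 2.0 ** (1 - tau[(i, j)])
               for i, j in combinations(leaves, 2))


# Four taxa (integers); internal nodes are the strings "u" and "v".
true_tree = build_adjacency([(0, "u"), (1, "u"), ("u", "v"), (2, "v"), (3, "v")])
alt_tree = build_adjacency([(0, "u"), (2, "u"), ("u", "v"), (1, "v"), (3, "v")])

# Additive distances induced by true_tree with unit edge lengths.
d = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]

true_score = bme_length(true_tree, [0, 1, 2, 3], d)
alt_score = bme_length(alt_tree, [0, 1, 2, 3], d)
```

On this toy input the generating topology scores 5.0 (the sum of its five unit edge lengths) and the alternative topology scores 5.5, so a local search that minimises this quantity would prefer the former. A local search of the kind the abstract describes would evaluate neighbouring topologies with such a score; how the preprint's RL agent selects moves is not shown here.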

Version published to 10.21203/rs.3.rs-6809300/v1 on Research Square
Jun 25, 2025
