Deep learning for mass extinction detection on fossilized phylogenies: power, limitations, and lessons for simulation-based phylodynamic inference

Minghao Du
Wenhui Wang
Jingqiang Tan
Joëlle Barido-Sottani

Abstract

Detecting mass extinction events from phylogenies is a fundamental yet challenging task. While traditional likelihood-based methods are available, deep learning offers a powerful, simulation-based alternative. Here, we evaluate a deep learning approach using a novel hybrid model that combines Graph Neural Networks with Long Short-Term Memory networks. This model analyzes phylogenies—containing both extant species and fossils—simulated under a complex skyline Fossilized Birth-Death model that incorporates mass extinctions and fluctuating background rates. We validate the architecture's effectiveness through ablation studies. Our investigation revealed that the stochasticity of the simulation was a primary obstacle, creating significant "label noise" that initially limited performance. A direct comparison showed our deep learning approach performed slightly better than Bayesian methods. It is robust to uncertainty in phylogenetic branch lengths and topology and generalizes to larger trees, but its performance degrades under model mismatch with higher background extinction rates. However, our work highlights a critical limitation: the model is highly specific to the definition of mass extinction it was trained on. Consequently, any modification to this definition necessitates retraining a new model from scratch. We conclude by summarizing the challenges and lessons learned for simulation-based inference in phylodynamics.

Version published to 10.1101/2025.10.20.683352 on bioRxiv, Oct 20, 2025

Related articles

Evaluating the impact and detectability of mass extinctions on total-evidence dating

This article has 4 authors:
1. Minghao Du
2. Wenhui Wang
3. Jingqiang Tan
4. Joëlle Barido-Sottani
This article has no evaluations. Latest version Sep 30, 2025.

On the utility of Deep Learning for model classification and parameter estimation on complex diversification scenarios

This article has 5 authors:
1. P.G. Peña
2. G. Iglesias
3. E. Talavera
4. AS. Meseguer
5. I. Sanmartín
Reviewed by Arcadia Science

This article has 1 evaluation. Appears in 1 list. Latest version Aug 27, 2025. Latest activity Sep 5, 2025.

Accurate and efficient phylogenetic inference through end-to-end deep learning

This article has 5 authors:
1. Xinru Zhang
2. Shizhe Ding
3. Chungong Yu
4. Jianquan Zhao
5. Dongbo Bu
This article has no evaluations. Latest version Oct 2, 2025.
