GhostParser: A highly scalable phylogenomic approach for the identification of ghost introgression

Ethan R. Tolman
Anton Suvorov

Abstract

A growing body of empirical research shows that interspecific gene flow is a widespread biological force that shapes evolutionary histories across the Tree of Life. Computational approaches designed to detect introgression either employ full-likelihood frameworks, including Bayesian ones, to directly estimate phylogenetic networks, or use summary statistics derived from locus sequence alignments or estimated gene trees to map gene flow events onto the species tree. Many current methods have major shortcomings. The computationally scalable summary-statistic and pseudo-likelihood techniques may give erroneous results in the presence of so-called “ghost” introgression and rate variation between lineages. Full-likelihood methods, on the other hand, are more accurate but are not computationally tractable for large phylogenomic datasets. Here, we develop a novel summary statistic, based on the tree heights of different gene tree topologies, that reliably distinguishes between sampled and ghost introgression events. We implemented this approach in the publicly accessible bioinformatic pipeline “GhostParser”. We demonstrate that GhostParser can accurately distinguish between scenarios of sampled and ghost introgression, even in the presence of rate variation between lineages. Our method generally matches the accuracy of the full-likelihood software Bayesian Phylogenetics and Phylogeography (BPP) on empirical datasets and outperforms BPP under our simulation conditions, in both cases in a small fraction of the computational time. We show that GhostParser is a scalable tool for identifying different introgression patterns in phylogenomic datasets.
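
The abstract does not give the formula for the statistic, so the following is only a minimal sketch of the kind of per-topology summaries that a tree-height-based approach could start from. It is not GhostParser's implementation. It assumes a rooted species triplet ((A,B),C), gene trees already reduced to a topology label plus a tree height, and hypothetical names throughout (`summarize_by_topology`, `discordant_asymmetry`, the labels "AB", "BC", "AC"). The only biological fact it leans on is the standard one that, under incomplete lineage sorting alone, the two discordant topologies are expected to be equally frequent.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_topology(gene_trees):
    """Group (topology, height) records; return count and mean height per topology.

    `gene_trees` is an iterable of (label, height) pairs. The labels used here
    ("AB" concordant, "BC" and "AC" discordant) are illustrative assumptions.
    """
    heights = defaultdict(list)
    for topology, height in gene_trees:
        heights[topology].append(height)
    return {
        topology: {"count": len(values), "mean_height": mean(values)}
        for topology, values in heights.items()
    }


def discordant_asymmetry(summary, first="BC", second="AC"):
    """Normalized difference in counts of the two discordant topologies.

    Values near 0 are what incomplete lineage sorting alone would predict;
    a strong imbalance hints at gene flow. This is a generic count-based
    signal, not the statistic described in the abstract.
    """
    n_first = summary.get(first, {"count": 0})["count"]
    n_second = summary.get(second, {"count": 0})["count"]
    total = n_first + n_second
    return (n_first - n_second) / total if total else 0.0


# Toy, made-up records purely to exercise the functions.
records = [
    ("AB", 1.0), ("AB", 1.2),
    ("BC", 0.8), ("BC", 0.6), ("BC", 0.7),
    ("AC", 1.5),
]
summary = summarize_by_topology(records)
asymmetry = discordant_asymmetry(summary)
```

A height-based statistic would go one step further and compare the `mean_height` values across topology classes; how those comparisons separate sampled from ghost introgression is specified in the full article, not here.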

Version published to 10.1101/2025.08.21.671585 on bioRxiv
Aug 22, 2025

Related articles

Introgression facilitates rapid evolution of Galápagos tree finches

This article has 8 authors:
1. Matteo Sebastianelli Arbelaez
2. Erik Enbody
3. Carl-Johan Rubin
4. Carlos Valle
5. Lukas Keller
6. Rosemary Grant
7. Peter Grant
8. Leif Andersson
This article has no evaluations
Latest version Jan 21, 2026

Rapid Phylogenomic Analysis of Thousands Outbreak‐Causing Viral Genomes Using Covary

This article has 1 author:
1. Marvin I. De los Santos
This article has no evaluations
Latest version Dec 22, 2025

The weak driver conundrum: data archiving and biological phenomena impact macrogenetic findings

This article has 2 authors:
1. Ivo Colmonero-Costeira
2. Deborah Leigh
This article has no evaluations
Latest version Dec 10, 2025
