Evaluation of methods for estimating coalescence times using ancestral recombination graphs

Débora Y. C. Brandt
Xinzhu Wei
Yun Deng
Andrew H Vaughn
Rasmus Nielsen

Abstract

The ancestral recombination graph is a structure that describes the joint genealogies of sampled DNA sequences along the genome. Recent computational methods have made impressive progress toward scalably estimating whole-genome genealogies. In addition to inferring the ancestral recombination graph, some of these methods can also provide ancestral recombination graphs sampled from a defined posterior distribution. Obtaining good samples of ancestral recombination graphs is crucial for quantifying statistical uncertainty and for estimating population genetic parameters such as effective population size, mutation rate, and allele age. Here, we use standard neutral coalescent simulations to benchmark the estimates of pairwise coalescence times from 3 popular ancestral recombination graph inference programs: ARGweaver, Relate, and tsinfer+tsdate. We (1) compare the true coalescence times to the inferred times at each locus; (2) compare the distribution of coalescence times across all loci to the expected exponential distribution; and (3) test whether the sampled coalescence times have the properties expected of a valid posterior distribution. We find that inferred coalescence times at each locus are most accurate in ARGweaver, and often more accurate in Relate than in tsinfer+tsdate. However, all 3 methods tend to overestimate small coalescence times and underestimate large ones. Lastly, the posterior distribution of ARGweaver is closer to the expected posterior distribution than Relate’s, but this higher accuracy comes at a substantial cost in scalability. The best choice of method will depend on the number and length of input sequences and on the goal of downstream analyses, and we provide guidelines for best practices.
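
The second comparison in the abstract, checking coalescence times across loci against the expected exponential distribution, can be illustrated with a minimal sketch. It is not the authors' pipeline and does not call ARGweaver, Relate, or tsinfer+tsdate. It assumes times are rescaled so that the expected pairwise coalescence time is 1, draws "true" times directly from that exponential, and uses a hypothetical `shrink_toward_mean` helper to mimic estimates that are too large for small times and too small for large ones. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def exp_cdf(t, mean):
    """CDF of an exponential distribution with the given mean."""
    return 1.0 - math.exp(-t / mean) if t > 0 else 0.0


def ks_statistic(times, mean):
    """One-sample Kolmogorov-Smirnov distance between the empirical
    distribution of `times` and an exponential with the given mean."""
    xs = sorted(times)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = exp_cdf(x, mean)
        # Compare the theoretical CDF with the empirical CDF just
        # before and just after its jump at x.
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d


def simulate_pairwise_times(n_loci, mean, rng):
    """Assumption: independent loci, constant population size, so each
    pairwise coalescence time is exponential with the given mean."""
    return [rng.expovariate(1.0 / mean) for _ in range(n_loci)]


def shrink_toward_mean(times, mean, weight):
    """Hypothetical stand-in for a biased estimator: pulls every time
    toward the mean, inflating small times and deflating large ones."""
    return [(1.0 - weight) * t + weight * mean for t in times]


rng = random.Random(1)
true_times = simulate_pairwise_times(5000, 1.0, rng)
biased_times = shrink_toward_mean(true_times, 1.0, 0.5)

d_true = ks_statistic(true_times, 1.0)
d_biased = ks_statistic(biased_times, 1.0)
print(f"KS distance, true times:   {d_true:.3f}")
print(f"KS distance, shrunk times: {d_biased:.3f}")
```

In this toy setting the shrunk times sit much further from the exponential than the true times do, which is the kind of departure a distribution-level comparison is meant to expose.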

Version published to 10.1093/genetics/iyac044
Mar 25, 2022
Version published to 10.1101/2021.11.15.468686v4 on bioRxiv
Mar 23, 2022
Version published to 10.1101/2021.11.15.468686v3 on bioRxiv
Feb 22, 2022
Version published to 10.1101/2021.11.15.468686v2 on bioRxiv
Nov 27, 2021
Version published to 10.1101/2021.11.15.468686v1 on bioRxiv
Nov 17, 2021

Related articles

Tspecies, Rapid Optimization for Estimating Species Divergence Time Using Ks Distribution

This article has 4 authors:
1. Mi-Jia Li
2. Xiao-Xue Li
3. Lin-Lin Xu
4. Bo-Wen Zhang
This article has no evaluations. Latest version: May 23, 2025

Coalescence and Translation: A Language Model for Population Genetics

This article has 5 authors:
1. Kevin Korfmann
2. Nathaniel S. Pope
3. Melinda Meleghy
4. Aurélien Tellier
5. Andrew D. Kern
This article has no evaluations. Latest version: Jun 27, 2025

Sampling Aware Ancestral State Inference

This article has 4 authors:
1. Yexuan Song
2. Ivan Gill
3. Ailene MacPherson
4. Caroline Colijn
This article has no evaluations. Latest version: May 23, 2025
