Accelerated Bayesian inference of population size history from recombining sequence data

Abstract

I present phlash, a new Bayesian method for inferring population history from whole genome sequence data. phlash is population history learning by averaging sampled histories: it works by drawing random, low-dimensional projections of the coalescent intensity function from the posterior distribution of a psmc-like model, and averaging them together to form an accurate and adaptive size history estimator. On simulated data, phlash tends to be faster and have lower error than several competing methods including smc++, msmc2, and FitCoal. Moreover, it provides a full posterior distribution over population size history, leading to automatic uncertainty quantification of the point estimates, as well as to new Bayesian testing procedures for detecting population structure and ancient bottlenecks. On the technical side, the key advance is a novel algorithm for computing the score function (gradient of the log-likelihood) of a coalescent hidden Markov model: when there are M hidden states, the algorithm requires 𝒪(M²) time and 𝒪(1) memory per decoded position, the same cost as evaluating the log-likelihood itself using the naïve forward algorithm. This algorithm is combined with a hand-tuned implementation that fully leverages the power of modern GPU hardware, and the entire method has been released as an easy-to-use Python software package.
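
For readers unfamiliar with the baseline cost quoted above, the following is a minimal sketch of the naïve forward algorithm for a generic discrete hidden Markov model. It is not the score-function algorithm introduced in the article and not code from the phlash package; the function name `forward_loglik` and its arguments are hypothetical, chosen only for illustration. Each position costs one matrix-vector product over the M hidden states, and only the current forward vector is kept, so working memory does not grow with the sequence length.

```python
import math


def forward_loglik(init, trans, emit, obs):
    """Log-likelihood of `obs` under a discrete HMM (scaled forward algorithm).

    Hypothetical illustration, not the phlash API.
    init[i]     : P(first hidden state = i)
    trans[i][j] : P(next state = j | current state = i)
    emit[i][y]  : P(observation = y | hidden state = i)
    """
    M = len(init)
    # Forward vector at the first position.
    alpha = [init[i] * emit[i][obs[0]] for i in range(M)]
    scale = sum(alpha)
    loglik = math.log(scale)
    # Normalize so the vector never underflows on long sequences.
    alpha = [a / scale for a in alpha]
    for y in obs[1:]:
        # One O(M^2) update per position; the previous vector is overwritten.
        alpha = [
            emit[j][y] * sum(alpha[i] * trans[i][j] for i in range(M))
            for j in range(M)
        ]
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik
```

A call such as `forward_loglik([0.6, 0.4], [[0.7, 0.3], [0.2, 0.8]], [[0.9, 0.1], [0.3, 0.7]], [0, 1])` evaluates a two-state toy model. According to the abstract, the article's contribution is obtaining the gradient of this quantity at the same asymptotic cost.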
