Bayesian model-averaging of parametric coalescent models for phylodynamic inference
Abstract
Bayesian phylodynamic models have become essential for reconstructing population history from genetic data, yet their accuracy depends crucially on choosing an appropriate demographic model. To address uncertainty in model choice, we introduce a Bayesian Model Averaging (BMA) framework that integrates multiple parametric coalescent models (constant, exponential, logistic, and Gompertz growth) along with their "expansion" variants, which account for non-zero ancestral populations. Implemented in a Bayesian setting with Metropolis-coupled MCMC, this approach allows the sampler to switch among candidate growth functions, thereby capturing demographic histories without having to pre-specify a single model. Simulation studies indicate that the logistic and Gompertz models may require specialised sampling strategies, such as adaptive multivariate proposals, to achieve robust mixing. We demonstrate the performance of these models on datasets simulated under different substitution models, and show that joint inference of genealogy and population parameters is well calibrated when correlated-move operators and BMA are properly incorporated. We then apply this method to two real-world datasets. Analysis of Egyptian Hepatitis C virus (HCV) sequences indicates that models with a founder population followed by a rapid expansion are well supported, with a slight preference for Gompertz-like expansions. Our analysis of a single metastatic colorectal cancer (CRC) single-cell dataset suggests that exponential-like growth is plausible even in an advanced-stage cancer patient. We believe this highlights that tumour subclones may retain substantial proliferative capacity into the later stages of the disease. Overall, our unified BMA framework reduces the need for restrictive model selection procedures and can also provide deeper biological insights into epidemic spread and tumour evolution.
By systematically integrating multiple growth hypotheses within a standard Bayesian setting, this approach naturally avoids overfitting and offers a powerful tool for inferring population histories across diverse biological domains.
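As a rough illustration of the candidate growth functions named above, the following Python sketch writes each one as an effective population size N(t), with t measured backwards from the present. The parameterisations (present-day size `n0`, growth rate `r`, logistic shape `c`, Gompertz carrying capacity `n_inf`, ancestral size `n_anc`) and all function names are assumptions chosen for illustration, not necessarily those used in the article. The last helper shows one simple way to summarise a model-averaged run, assuming the sampler records a model indicator at each logged state.

```python
import math
from collections import Counter


def constant(t, n0):
    """Constant size: N(t) = n0."""
    return n0


def exponential(t, n0, r):
    """Exponential growth; size shrinks going back in time."""
    return n0 * math.exp(-r * t)


def logistic(t, n0, r, c):
    """Logistic growth (assumed form): N(0) = n0 for any shape c > 0."""
    return n0 * (1.0 + c) / (1.0 + c * math.exp(r * t))


def gompertz(t, n0, r, n_inf):
    """Gompertz growth (assumed form) with carrying capacity n_inf > n0."""
    return n_inf * math.exp(math.log(n0 / n_inf) * math.exp(r * t))


def expansion(base, t, n0, n_anc, *params):
    """Expansion variant (assumed form): the base curve is rescaled so
    the size tends to the ancestral size n_anc instead of zero."""
    shape = base(t, n0, *params) / n0  # equals 1 at t = 0
    return n_anc + (n0 - n_anc) * shape


def posterior_model_probs(indicators):
    """Estimate posterior model probabilities as the fraction of
    logged MCMC states spent in each model."""
    counts = Counter(indicators)
    total = len(indicators)
    return {model: k / total for model, k in counts.items()}


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for t in (0.0, 5.0, 50.0):
        print(t, expansion(gompertz, t, 1000.0, 50.0, 0.3, 5000.0))
    print(posterior_model_probs(["gompertz", "gompertz", "exponential"]))
```

In this sketch every curve equals `n0` at the present, and each expansion variant approaches `n_anc` far in the past. These two properties are what allow curves from different models to be compared and averaged on a common scale.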