PSMC-FAC: Automated Optimization of False-Negative Rate Corrections for Low-Coverage PSMC-Based Demographic Inference

Abstract

Inferring demographic history from whole-genome data is a central objective in evolutionary and conservation genomics. However, the Pairwise Sequentially Markovian Coalescent (PSMC) framework, one of the most widely used demographic inference methods for whole-genome sequence data, is highly sensitive to sequencing coverage: low coverage leads to systematic underestimation of heterozygosity, which in turn biases inferred effective population size trajectories. Here, we present PSMC-FAC, an automated method that optimizes the false-negative rate (FNR) correction for low-coverage genomes by minimizing geometric distances between FNR-corrected low-coverage trajectories and their corresponding high-coverage references. Whole-genome datasets from humans, gray wolves, and cattle were downsampled to multiple coverage levels and processed through standard demographic inference pipelines. Corrected trajectories, projected onto a common temporal grid, were compared using the Hausdorff and discrete Fréchet distances, and optimal correction factors were modeled as a function of sequencing depth using second-degree polynomial regression. Across species and demographic contexts, PSMC-FAC substantially improved concordance between low- and high-coverage trajectories and revealed highly predictable coverage-dependent correction patterns. Overall, PSMC-FAC provides a reproducible and mathematically grounded alternative to subjective correction approaches, enabling reliable demographic inference from moderate-coverage genomes and facilitating broader population-scale genomic analyses.
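
The distance-based selection and regression steps described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`hausdorff`, `discrete_frechet`, `best_candidate`, `fit_quadratic`) are hypothetical, trajectories are assumed to be arrays of (time, effective population size) points already projected onto a common grid, and the candidate corrections are assumed to be supplied as a dictionary keyed by correction factor. No values from the study are reproduced here.

```python
import numpy as np


def _pairwise(p, q):
    """Euclidean distance matrix between points of p (n, 2) and q (m, 2)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2)


def hausdorff(p, q):
    """Symmetric Hausdorff distance between two point sequences."""
    d = _pairwise(p, q)
    # Largest nearest-neighbour distance, taken in both directions.
    return max(d.min(axis=1).max(), d.min(axis=0).max())


def discrete_frechet(p, q):
    """Discrete Frechet distance via the standard dynamic-programming recurrence."""
    d = _pairwise(p, q)
    n, m = d.shape
    ca = np.empty((n, m))
    ca[0, 0] = d[0, 0]
    for i in range(1, n):
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for j in range(1, m):
        ca[0, j] = max(ca[0, j - 1], d[0, j])
    for i in range(1, n):
        for j in range(1, m):
            prev = min(ca[i - 1, j], ca[i, j - 1], ca[i - 1, j - 1])
            ca[i, j] = max(prev, d[i, j])
    return ca[-1, -1]


def best_candidate(reference, candidates, metric):
    """Return the key (assumed: a correction factor) whose trajectory
    lies closest to the high-coverage reference under `metric`."""
    return min(candidates, key=lambda k: metric(reference, candidates[k]))


def fit_quadratic(depths, factors):
    """Least-squares fit of factor = a*depth**2 + b*depth + c; returns (a, b, c)."""
    return np.polyfit(depths, factors, deg=2)
```

Unlike the Hausdorff distance, the discrete Fréchet distance respects the ordering of points along each trajectory, so the two metrics can disagree when curves cover similar regions but traverse them differently. In practice, time and population size are usually on very different scales, so some rescaling (for example, log-transforming both axes) would likely be needed before computing either distance; that choice is an assumption of this sketch and is not stated in the abstract.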
