Unifying population structure and relatedness analysis through a coalescent approach

Diego Veliz-Otani
Victor Borda
Heinner Guio
Omar Caceres
Cesar Sanchez
Carlos Padilla
Eimear E. Kenny
Sebastian Zollner
Timothy O’Connor

Abstract

Standard methods in genome-wide association studies (GWAS) partition genetic similarity into recent familial relationship, modeled by a genetic relationship matrix (GRM), and distant relatedness, adjusted for using principal components (PCs). This practice relies on an implicit causal model that conflates population structure with confounding. Here, we challenge this approach by developing a unified framework grounded in coalescent theory. We introduce the Coefficient of Genealogical Similarity (GeSi), a statistic derived from a model of shared derived alleles that captures the full continuum of shared ancestry and can be estimated directly from genotype data. This leads to a new classification of GRMs into “full” matrices, which capture the complete genealogy, and “shallow” matrices, which measure only recent relatedness. Systematic benchmarking demonstrates that full GRMs are sufficient to model the genetic covariance from population structure, rendering PC adjustment for this purpose redundant. This finding clarifies that the justifiable role for PCs in such a model is to correct for true environmental or complex genetic confounders. Our analyses of empirical data confirm that including PCs can improve model fit, providing evidence that such confounding is present and correlated with axes of genetic variation. This work establishes a new theoretical framework that disentangles the modeling of genealogical relatedness from the correction of confounding, reframing the role of PCs as proxies for the latter and challenging the rationale for including the top PCs merely to capture maximal genetic variance.
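For readers unfamiliar with the conventional practice the abstract refers to, the following is a minimal sketch of one widely used GRM construction: the average, over SNPs, of products of standardized allele counts. It is not the GeSi statistic, and it does not implement the "full" versus "shallow" classification, neither of which is specified on this page. The function name `standardized_grm` and the toy genotypes are hypothetical and chosen only for illustration. In standard GWAS pipelines, the PCs used as covariates are typically the leading eigenvectors of a matrix of this kind.

```python
from math import sqrt

def standardized_grm(genotypes):
    """Conventional standardized GRM (illustrative; NOT the preprint's GeSi).

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2),
    one entry per SNP. Returns an n x n list of lists.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    z = [[] for _ in range(n)]  # standardized genotypes per individual
    used = 0                    # number of polymorphic SNPs kept
    for j in range(m):
        col = [g[j] for g in genotypes]
        p = sum(col) / (2 * n)  # sample allele frequency
        if p == 0.0 or p == 1.0:
            continue            # skip monomorphic SNPs (zero variance)
        sd = sqrt(2 * p * (1 - p))
        for i in range(n):
            z[i].append((col[i] - 2 * p) / sd)
        used += 1
    if used == 0:
        raise ValueError("no polymorphic SNPs")
    # Entry (i, k) is the mean product of standardized genotypes.
    return [[sum(a * b for a, b in zip(z[i], z[k])) / used
             for k in range(n)] for i in range(n)]

# Toy data (made up): individuals 0 and 1 are identical, individual 2 differs.
toy = [[0, 1, 2],
       [0, 1, 2],
       [2, 1, 0]]
G = standardized_grm(toy)
```

In this toy example the two identical individuals share a positive off-diagonal entry, while each has a negative entry with the third individual. Real analyses use many thousands of SNPs and typically apply allele-frequency and missingness filters first.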

Version published to 10.1101/2025.10.16.682899 on bioRxiv
Oct 16, 2025

Related articles

Reframing Population Genetic Structure as a Quantum Optimization Problem

This article has 1 author:
1. Andrew Davinack
This article has no evaluations. Latest version: Dec 24, 2025

Genetic estimates of relatedness: Established practices and new opportunities through low coverage whole genome sequencing

This article has 8 authors:
1. Annika Freudiger
2. Natalie Kestel
3. Vladimir Jovanovic
4. Mariana Madruga de Brito
5. Angelina Ruiz-Lambides
6. Katja Nowick
7. Anja Widdig
8. Harald Ringbauer
This article has no evaluations. Latest version: Jan 23, 2026

Decoding Complex Genotype-Phenotype Interactions by Discretizing the Genome

This article has 6 authors:
1. Jędrzej Kubica
2. Hetvi Jethwani
3. Krzysztof H. Banecki
4. Mauricio Moldes
5. Dariusz Plewczynski
6. Ben Busby
This article has no evaluations. Latest version: Dec 17, 2025
