Benchmarking kinship estimation tools for ancient genomes using pedigree simulations

Şevval Aktürk
Igor Mapelli
Merve N. Güler
Kanat Gürün
Büşra Katırcıoğlu
Kıvılcım Vural
Ekin Sağlıcan
Mehmet Çetin
Reyhan Yaka
Elif Sürer
Gözde Atağ
Sevim Seda Çokoğlu
Arda Sevkar
N. Ezgi Altınışık
Dilek Koptekin
Mehmet Somel

Abstract

There is growing interest in uncovering genetic kinship patterns in past societies using low-coverage paleogenomes. Here, we benchmark four tools for kinship estimation with such data: lcMLkin, NgsRelate, KIN, and READ, which differ in their input, IBD estimation methods, and statistical approaches. We used pedigree and ancient genome sequence simulations to evaluate these tools when only a limited number (1K to 50K) of shared SNPs (with minor allele frequency ≥0.01) are available. The performance of all four tools was comparable using ≥20K SNPs. We found that first-degree related pairs can be accurately classified even with 1K SNPs, with 85% F1 scores using READ and 96% using NgsRelate or lcMLkin. Distinguishing third-degree relatives from unrelated pairs or second-degree relatives was also possible with high accuracy (F1 >90%) with 5K SNPs using NgsRelate and lcMLkin, while READ and KIN showed lower success (69% and 79%, respectively). Meanwhile, noise in population allele frequencies and inbreeding (first cousin mating) led to deviations in kinship coefficients, with different sensitivities across tools. We conclude that using multiple tools in parallel might be an effective approach to achieve robust estimates on ultra-low coverage genomes.
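
As an illustration of the per-degree F1 scores quoted above, the following is a minimal sketch of how F1 could be computed for each kinship class from true and predicted labels. It is not the authors' evaluation pipeline: the function name `per_class_f1`, the label strings, and the toy data are all hypothetical, and the sketch assumes each simulated pair receives exactly one predicted class.

```python
from collections import Counter


def per_class_f1(true_labels, predicted_labels):
    """Return a dict mapping each kinship class to its F1 score.

    F1 = 2*TP / (2*TP + FP + FN), computed one class at a time.
    (Hypothetical helper, not taken from the preprint.)
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy, made-up labels for six simulated pairs (illustration only).
truth = ["first", "first", "second", "third", "unrelated", "unrelated"]
pred = ["first", "first", "third", "third", "unrelated", "third"]

f1 = per_class_f1(truth, pred)
for label in sorted(f1):
    print(f"{label}: F1 = {f1[label]:.2f}")
```

In this toy case the third-degree class is penalized by two false positives (one second-degree pair and one unrelated pair called third-degree), which is the kind of confusion between adjacent classes that the abstract reports at low SNP counts.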

Version published to 10.22541/au.171011990.00781504/v1
Mar 11, 2024

Related articles

Genetic estimates of relatedness: Established practices and new opportunities through low coverage whole genome sequencing

This article has 8 authors:
1. Annika Freudiger
2. Natalie Kestel
3. Vladimir Jovanovic
4. Mariana Madruga de Brito
5. Angelina Ruiz-Lambides
6. Katja Nowick
7. Anja Widdig
8. Harald Ringbauer
This article has no evaluations. Latest version: Jan 23, 2026

Genetic diversity in the Criollo Argentino horse from SNP array data

This article has 6 authors:
1. Claudia Corbi-Botto
2. María Eugenia Zappa
3. Sebastián Andrés Sadaba
4. Pilar Peral-García
5. Guillermo Giovambattista
6. Silvina Díaz
This article has no evaluations. Latest version: Jan 20, 2026

Comparison of BLUPF90IOD3 and MiXBLUP implementations of the single-step model applied to the Polish national dairy cattle evaluation

This article has 4 authors:
1. Dawid Słomian
2. Michalina Jakimowicz
3. Tomasz Suchocki
4. Joanna Szyda
This article has no evaluations. Latest version: Dec 22, 2025
