DeepKin: Predicting relatedness from low-coverage genomes and paleogenomes with convolutional neural networks

Abstract

DeepKin is a novel tool designed to predict relatedness from genomic data using convolutional neural networks (CNNs). Traditional methods for estimating relatedness often struggle when genomic data are limited, as with paleogenomes and degraded forensic samples. DeepKin addresses this challenge by using two CNN models trained on simulated genomic data to classify relatedness up to the third degree and to identify parent-offspring and sibling pairs. Our benchmarking shows that DeepKin performs comparably to or better than the widely used tool READv2. We validated DeepKin on empirical paleogenomes from two archaeological sites, demonstrating its robustness and adaptability across different genetic backgrounds, with accuracy above 90% for pairs sharing more than 10,000 SNPs. By capturing information across genomic segments, DeepKin offers a new methodological path for relatedness estimation from highly degraded samples, with applications in ancient DNA as well as forensic and conservation genetics.
