AdDeam: a fast and scalable tool for estimating and clustering reference-level damage profiles

Abstract

Motivation

DNA damage patterns, such as increased frequencies of C→T and G→A substitutions at fragment ends, are widely used in ancient DNA studies to assess authenticity and detect contamination. In metagenomic studies, fragments can be mapped against multiple references or de novo assembled contigs to identify those likely to be ancient. Generating and comparing damage profiles, however, can be both tedious and time-consuming. Although tools exist for estimating damage in single reference genomes and metagenomic datasets, none efficiently cluster damage patterns.
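To make the notion of a damage profile concrete, the following is a minimal sketch of how a position-wise C→T frequency could be tallied from aligned fragments. It is an illustration only and not AdDeam's implementation: the function name `ct_frequency_by_position`, the toy `pairs` data, and the simplification that each alignment is a gap-free (reference, read) string pair starting at the 5' end are all assumptions made for this example.

```python
def ct_frequency_by_position(pairs, max_pos=5):
    """Fraction of reference C bases observed as T at each 5' offset.

    `pairs` is a hypothetical input format: (reference, read) strings of
    equal length, aligned without gaps and oriented from the 5' end.
    """
    c_total = [0] * max_pos   # reference C count per position
    c_to_t = [0] * max_pos    # of those, how many were read as T
    for ref, read in pairs:
        for i in range(min(max_pos, len(ref), len(read))):
            if ref[i] == "C":
                c_total[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    # Positions with no reference C are reported as 0.0 rather than NaN.
    return [c_to_t[i] / c_total[i] if c_total[i] else 0.0
            for i in range(max_pos)]


# Toy alignments (invented for illustration): three of four fragments
# carry a C->T substitution at the first 5' position.
pairs = [
    ("CACGC", "TACGC"),
    ("CCGTA", "TCGTA"),
    ("CAGCT", "CAGCT"),
    ("CGCAT", "TGCAT"),
]
profile = ct_frequency_by_position(pairs, max_pos=5)
print(profile)
```

In this toy input the elevated frequency is confined to the first position, which is the kind of signal at fragment ends that damage profiles are meant to capture. A G→A profile at the 3' end could be tallied analogously.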

Results

To address this methodological gap, we developed AdDeam, a tool that combines rapid damage estimation with clustering for streamlined analyses and easy identification of potential contaminants or outliers. AdDeam takes aligned ancient DNA (aDNA) fragments from various samples or contigs as input, computes damage patterns, and clusters them. It outputs a representative damage profile per cluster, the probability of each sample belonging to a cluster, and a Principal Component Analysis of the per-sample damage patterns for fast visualisation. We evaluated AdDeam on both simulated and empirical datasets. AdDeam effectively distinguishes different damage levels, such as those of uracil-DNA glycosylase-treated samples and of specimens from different time periods, and can also distinguish between contigs containing modern or ancient fragments, providing a clear framework for aDNA authentication and facilitating large-scale analyses.
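The general shape of such a workflow (damage profiles in, cluster probabilities and a 2D projection out) can be sketched as follows. This is a hedged illustration and not AdDeam's actual algorithm or interface: the soft k-means style assignment is a stand-in chosen for brevity, and the function names `pca_2d` and `soft_cluster`, the `temperature` parameter, and the toy profile values are all assumptions invented for this example.

```python
import numpy as np


def pca_2d(profiles):
    """Project profiles onto their first two principal components via SVD."""
    centered = profiles - profiles.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


def soft_cluster(profiles, k=2, n_iter=20, temperature=0.01):
    """Soft k-means style clustering (illustrative stand-in, see lead-in).

    Returns (centroids, probs), where probs[i, j] is the soft assignment
    of sample i to cluster j and each row sums to 1.
    """
    n = profiles.shape[0]
    # Deterministic initialisation: spread the starting centroids across
    # samples ranked by total damage, from lowest to highest.
    order = np.argsort(profiles.sum(axis=1))
    picks = order[np.linspace(0, n - 1, k).round().astype(int)]
    centroids = profiles[picks].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance of every sample to every centroid.
        diff = profiles[:, None, :] - centroids[None, :, :]
        dist2 = (diff ** 2).sum(axis=2)
        # Numerically stable softmax over negative scaled distances.
        logits = -dist2 / temperature
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Each centroid becomes the probability-weighted mean profile.
        centroids = (probs.T @ profiles) / probs.sum(axis=0)[:, None]
    return centroids, probs


# Toy C->T frequencies at the first five 5' positions (invented values):
# three high-damage samples followed by three low-damage samples.
profiles = np.array([
    [0.30, 0.15, 0.08, 0.05, 0.03],
    [0.28, 0.14, 0.09, 0.05, 0.03],
    [0.32, 0.16, 0.08, 0.04, 0.03],
    [0.02, 0.01, 0.01, 0.01, 0.01],
    [0.03, 0.02, 0.01, 0.01, 0.01],
    [0.01, 0.01, 0.01, 0.01, 0.01],
])

centroids, probs = soft_cluster(profiles, k=2)
labels = probs.argmax(axis=1)   # hard label = most probable cluster
coords = pca_2d(profiles)       # 2D coordinates for plotting

print(labels)
print(np.round(probs, 3))
```

Here each row of `centroids` plays the role of a representative profile per cluster, `probs` gives per-sample cluster probabilities, and `coords` could be passed to any plotting library for a quick visual check.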

Availability and Implementation

AdDeam is publicly available at https://github.com/LouisPwr/AdDeam and can also be installed via Bioconda. It is implemented in Python and C++. All analysis scripts and datasets are available at https://github.com/LouisPwr/AdDeamAnalysis and on Zenodo (DOI: 10.5281/zenodo.15052427).
