MADRe: Strain-Level Metagenomic Classification Through Assembly-Driven Database Reduction

Abstract

Strain-level metagenomic classification is essential for understanding microbial diversity and functional potential, but it remains challenging, particularly when nothing is known in advance about the composition of the sample. We present MADRe, a modular and scalable pipeline for long-read strain-level metagenomic classification, enhanced with Metagenome Assembly-Driven Database Reduction. MADRe combines long-read metagenome assembly, contig-to-reference mapping reassignment based on an expectation-maximization algorithm for database reduction, and probabilistic read mapping reassignment to achieve sensitive and precise classification. We extensively evaluated MADRe on simulated datasets, mock communities, and a real anaerobic digester sludge metagenome, showing that it consistently outperforms existing tools, achieving higher precision with fewer false positives. MADRe's design allows users to run the database reduction step or the read classification step on its own. Used alone, the read classification step performs on par with the other tested tools. MADRe is open source and publicly available at https://github.com/lbcb-sci/MADRe.
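
The idea of reassigning ambiguous contig-to-reference mappings with expectation-maximization can be illustrated with a minimal Python sketch. This is not MADRe's implementation: the function name `em_reassign`, the input format (one dictionary of candidate references and positive mapping scores per contig), the uniform starting abundances, and the toy data are all assumptions made for illustration. The sketch treats each reference as a mixture component, alternates between splitting each contig fractionally across its candidates and re-estimating reference abundances, and then keeps only the references that win at least one contig.

```python
from collections import defaultdict


def em_reassign(candidates, n_iter=100, tol=1e-9):
    """Generic EM over ambiguous mappings (illustrative, not MADRe's code).

    candidates: dict of contig id -> {reference id: mapping score > 0}.
    Returns (abundance per reference, best reference per contig).
    """
    refs = sorted({r for hits in candidates.values() for r in hits})
    # Assumption: start from uniform reference abundances.
    abundance = {r: 1.0 / len(refs) for r in refs}

    for _ in range(n_iter):
        totals = defaultdict(float)
        # E-step: split each contig across its candidates in proportion
        # to (current abundance x mapping score).
        for hits in candidates.values():
            weights = {r: abundance[r] * s for r, s in hits.items()}
            norm = sum(weights.values())
            if norm == 0.0:
                continue  # guard against numerical underflow
            for r, w in weights.items():
                totals[r] += w / norm
        # M-step: new abundance is the average fractional assignment.
        updated = {r: totals[r] / len(candidates) for r in refs}
        delta = max(abs(updated[r] - abundance[r]) for r in refs)
        abundance = updated
        if delta < tol:
            break

    # Hard assignment: each contig goes to its most probable reference.
    assignment = {
        contig: max(hits, key=lambda r: abundance[r] * hits[r])
        for contig, hits in candidates.items()
    }
    return abundance, assignment


# Toy input (hypothetical): contig1 maps uniquely to strainA, while
# contig2 and contig3 map equally well to strainA and strainB.
candidates = {
    "contig1": {"strainA": 1.0},
    "contig2": {"strainA": 1.0, "strainB": 1.0},
    "contig3": {"strainA": 1.0, "strainB": 1.0},
}
abundance, assignment = em_reassign(candidates)
# The reduced database keeps only references that received a contig.
reduced_database = sorted(set(assignment.values()))
```

In this toy case the uniquely mapped contig pulls the ambiguous ones toward the same reference, so a reference supported only by shared mappings drops out of the reduced set. A real pipeline would derive the scores from alignment statistics and would likely apply additional filtering.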
