Efficient High-Quality Metagenome Assembly from Long Accurate Reads using Minimizer-space de Bruijn Graphs

Gaëtan Benoit
Sébastien Raguideau
Robert James
Adam M. Phillippy
Rayan Chikhi
Christopher Quince

Listed in

Evaluated articles (Arcadia Science)

Abstract

We introduce a novel metagenomics assembler for high-accuracy long reads. Our approach, implemented as metaMDBG, combines highly efficient de Bruijn graph assembly in minimizer space with both a multi-k′ approach for dealing with variations in genome coverage depth and an abundance-based filtering strategy for simplifying strain complexity. The resulting algorithm is more efficient than the state of the art while producing better assembly results. metaMDBG was 1.5 to 12 times faster than competing assemblers and required between one-tenth and one-thirtieth of the memory across a range of data sets. We obtained up to twice as many high-quality circularised prokaryotic metagenome-assembled genomes (MAGs) on the most complex communities, and a better recovery of viruses and plasmids. metaMDBG performs particularly well for abundant organisms whilst being robust to the presence of strain diversity. As a result, for the first time it is possible to efficiently reconstruct the majority of complex communities by abundance as near-complete MAGs.
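To make "de Bruijn graph assembly in minimizer space" concrete, here is a minimal sketch of the general idea, not metaMDBG's actual implementation. Each read is reduced to the ordered list of its selected minimizers, and graph nodes are tuples of k′ consecutive minimizers rather than k consecutive nucleotides. The function names, the hash-threshold selection rule, and the parameters `l`, `density` and `k_prime` are all illustrative assumptions. Reverse complements, abundance filtering and the multi-k′ iteration are deliberately left out.

```python
import hashlib


def lmer_hash(lmer: str) -> float:
    """Map an l-mer to a deterministic, roughly uniform value between 0 and 1."""
    digest = hashlib.blake2b(lmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") / 2**64


def to_minimizer_space(read: str, l: int, density: float) -> list[str]:
    """Keep, in read order, every l-mer whose hash falls below `density`.

    Assumption: this hash-threshold rule stands in for whatever minimizer
    selection scheme a real assembler uses.
    """
    lmers = (read[i:i + l] for i in range(len(read) - l + 1))
    return [m for m in lmers if lmer_hash(m) < density]


def k_min_mers(minimizers: list[str], k_prime: int) -> list[tuple[str, ...]]:
    """Slide a window of k_prime consecutive minimizers over the sequence."""
    return [
        tuple(minimizers[i:i + k_prime])
        for i in range(len(minimizers) - k_prime + 1)
    ]


def build_graph(reads: list[str], l: int, density: float, k_prime: int) -> set:
    """Collect edges between consecutive minimizer-space nodes from each read."""
    edges = set()
    for read in reads:
        nodes = k_min_mers(to_minimizer_space(read, l, density), k_prime)
        # Adjacent nodes share k_prime - 1 minimizers, as in a de Bruijn graph.
        edges.update(zip(nodes, nodes[1:]))
    return edges
```

Because only a fraction of l-mers are kept as minimizers, the graph has far fewer nodes than a nucleotide-level de Bruijn graph over the same reads, which is where the time and memory savings of this family of methods come from.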

Arcadia Science
Jul 15, 2023

Improved reconstruction of circularised phage and plasmid genomes

This is just an extra thing, but it would be interesting to see how well this tool performs at recovering genomes from eukaryotic lineages since short-read methods produce very fragmented assemblies. Some of the metagenomes in this list are from communities with eukaryotes, such as the cheese samples: https://github.com/PacificBiosciences/pb-metagenomics-tools/blob/master/docs/PacBio-Data.md

Arcadia Science
Jul 14, 2023

We grouped MAGs into three conventional categories based on the CheckM results: ‘near-complete’ if its completeness is ≥ 90% and its contamination is ≤ 5%, ‘high-quality’ if completeness ≥ 70% and contamination ≤ 10%, ‘medium quality’ if completeness ≥ 50% and contamination ≤ 10%.

Did you also take the number of rRNAs/tRNAs into consideration, as in the MIMAG/MISAG categories (https://www.nature.com/articles/nbt.3893)?
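The thresholds in the quoted passage can be written as a small function. This is a sketch based only on the three categories as quoted. The function name and the "unclassified" fallback label are assumptions, and the rRNA/tRNA criteria raised in the comment are not included. Since the categories are nested, each MAG is assigned the strictest category it satisfies.

```python
def classify_mag(completeness: float, contamination: float) -> str:
    """Assign a MAG to the strictest quality category it satisfies.

    Both arguments are percentages, as reported by CheckM.
    """
    if completeness >= 90 and contamination <= 5:
        return "near-complete"
    if completeness >= 70 and contamination <= 10:
        return "high-quality"
    if completeness >= 50 and contamination <= 10:
        return "medium quality"
    # Assumption: the quoted text does not name a label for remaining MAGs.
    return "unclassified"
```

Note that a MAG with 95% completeness but 7% contamination falls into 'high-quality' rather than 'near-complete', because it fails the stricter contamination threshold.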

Arcadia Science
Jul 14, 2023

Abstract

It would probably help bring visibility to the tool if the link to the GitHub repository were in the abstract.

Version published to 10.1101/2023.07.07.548136 on bioRxiv
Jul 8, 2023

