Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets

Nicholas D. Youngblut
Ruth E. Ley

This article has been Reviewed by the following groups

Read the full article

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

Evaluated articles (PeerJ)

Abstract

Mapping metagenome reads to reference databases is the standard approach for assessing microbial taxonomic and functional diversity from metagenomic data. However, public reference databases often lack recently generated genomic data such as metagenome-assembled genomes (MAGs), which can limit the sensitivity of read-mapping approaches. We previously developed the Struo pipeline in order to provide a straight-forward method for constructing custom databases; however, the pipeline does not scale well with the ever-increasing number of publicly available microbial genomes. Moreover, the pipeline does not allow for efficient database updating as new data are generated. To address these issues, we developed Struo2, which is >3.5-fold faster than Struo at database generation and can also efficiently update existing databases. We also provide custom Kraken2, Bracken, and HUMAnN3 databases that can be easily updated with new genomes and/or individual gene sequences. Struo2 enables feasible database generation for continually increasing large-scale genomic datasets.

Availability

Struo2: https://github.com/leylabmpi/Struo2
Pre-built databases: http://ftp.tue.mpg.de/ebio/projects/struo2/
Utility tools: https://github.com/nick-youngblut/gtdb_to_taxdump

PeerJ
Sep 16, 2021

Read the original source
PeerJ
Sep 16, 2021

Read the original source
Version published to 10.1101/2021.02.10.430604 on bioRxiv
Feb 10, 2021

META-DIFF: a k-mer-based pipeline that detects differentially abundant sequences in metagenomics whole genome sequencing

This article has 8 authors:
1. Louis-Maël Guéguen
2. Alban Mathieu
3. Simon Pelletier
4. Anthony Woo
5. Namita Misra
6. Magali Moreau
7. Olivier Perin
8. Arnaud Droit
This article has no evaluationsLatest version Jan 29, 2026
Shotgun metagenomics: a deep insight into the composition and function of the complex microbial world

This article has 7 authors:
1. Grazia Visci
2. Elisabetta Notario
3. Giuseppe Defazio
4. Mariano Francesco Caratozzolo
5. Bruno Fosso
6. Marinella Marzano
7. Graziano Pesole
This article has no evaluationsLatest version Jan 30, 2026
MiCoReCa (Microbiome Community Resource Catalogue) - Towards Centralized Curation And Integration Of Microbiome Bioinformatics Resources

This article has 8 authors:
1. Vivek Ashokan
2. Clara Emery
3. Agnès Barnabé
4. Valentin Loux
5. Christina Pavloudi
6. Paul Zierep
7. Nikolaos Strepis
8. Bérénice Batut
This article has no evaluationsLatest version Jan 6, 2026

This article has been Reviewed by the following groups

Discuss this preprint

Listed in

Abstract

Availability

Article activity feed

Related articles

META-DIFF: a k-mer-based pipeline that detects differentially abundant sequences in metagenomics whole genome sequencing

Shotgun metagenomics: a deep insight into the composition and function of the complex microbial world

MiCoReCa (Microbiome Community Resource Catalogue) - Towards Centralized Curation And Integration Of Microbiome Bioinformatics Resources