CELEBRIMBOR: Pangenomes from metagenomes

Abstract

Summary

Metagenome-assembled genomes (MAGs) are often incomplete, with sequences missing due to assembly errors or low coverage. Incomplete MAGs present a particular challenge for identifying the genes shared across a microbial population, known as core genes: a core gene missing from even a few assemblies is observed at a lower frequency than its true one and may be mischaracterized as non-core. Here, we present CELEBRIMBOR, a Snakemake pangenome analysis pipeline that uses a measure of genome completeness to automatically adjust the frequency threshold at which core genes are identified, enabling accurate core gene identification in MAGs.
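
To make the effect of incompleteness concrete, the following is a minimal sketch, not CELEBRIMBOR's actual model. It assumes that every gene in a genome is lost independently with probability 1 - completeness, and it rescales the core threshold by mean completeness; both the function names and the rescaling rule are illustrative assumptions.

```python
def expected_observed_frequency(true_frequency: float, completeness: float) -> float:
    """Expected fraction of assemblies in which a gene is observed.

    Assumption: each gene copy is lost independently with
    probability 1 - completeness.
    """
    return true_frequency * completeness


def adjusted_core_threshold(core_threshold: float, mean_completeness: float) -> float:
    """Naively rescale the core threshold by mean completeness.

    Illustrative assumption only, not the pipeline's method.
    """
    return core_threshold * mean_completeness


# A gene present in every genome, observed through 90%-complete MAGs.
observed = expected_observed_frequency(true_frequency=1.0, completeness=0.9)

# With a fixed 95% threshold the gene is no longer called core ...
print(observed >= 0.95)
# ... but it is recovered once the threshold accounts for completeness.
print(observed >= adjusted_core_threshold(0.95, mean_completeness=0.9))
```

Under these assumptions, a gene present in every genome is expected in only about 90% of assemblies, so a fixed 95% threshold misses it while the rescaled threshold does not.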

Availability and implementation

CELEBRIMBOR is published under the open-source Apache 2.0 licence at https://github.com/bacpop/CELEBRIMBOR and is available as a Docker container. Supplementary material is available in the online version of the article.

Article activity feed

  1. https://github.com/bacpop/CELEBRIMBOR

    Thanks for putting this together! I took a look at the repository and noticed a few changes that I think could help make CELEBRIMBOR more user-friendly.

    1. Would you be willing to add tool versions to your environment.yml file? This will help make sure the pipeline is still installable over time, and help results from the Docker container match those of users who deploy it elsewhere, e.g. on an HPC.
    2. Does create_plots.py belong in the scripts/ directory instead?
    3. The hashes after the second equals sign in some of your YAML files will make the environments difficult to install across different operating systems (Linux vs. Mac) (e.g. https://github.com/bacpop/CELEBRIMBOR/blob/main/envs/Snakemake.yaml).
    4. The README uses the old repo name in the clone/cd instructions.
    5. Lastly, have you explored using something like a click interface & making the tool conda-installable? I know this is a big lift, but in my experience it makes it so much easier for others to pick up and use, including dropping the pipeline into larger pipelines. This pipeline might provide some inspiration for how to accomplish this: https://github.com/metagenome-atlas/atlas
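
As a sketch of what points 1 and 3 amount to in practice, the helper below keeps the version pin of a conda spec but drops the platform-specific build string after the second equals sign. The function name and the example specs (including the build hash) are hypothetical and not taken from the repository; `conda env export --no-builds` should produce a similar result when exporting an environment.

```python
def strip_build_string(spec: str) -> str:
    """Return a conda spec without its build string.

    Conda specs have the form name=version=build; the build part is
    platform-specific. Pip-style pins (name==version) and unpinned
    names are returned unchanged.
    """
    if "==" in spec:
        return spec
    parts = spec.split("=")
    # Keep at most the name and the version.
    return "=".join(parts[:2])


# Hypothetical dependency list; the build hash is made up for illustration.
dependencies = [
    "python=3.10.12=hd12c33a_0_cpython",
    "snakemake=7.32.4",
    "numpy",
]

for dep in dependencies:
    print(strip_build_string(dep))
```

Keeping the version while dropping the build string preserves reproducibility of tool versions without tying the environment to a single operating system.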