Co-Occurrence and Cooperation between Comammox and Anammox Bacteria in a Full-Scale Attached Growth Municipal Wastewater Treatment Process

Abstract

No abstract available

Article activity feed

  1. and the past study9

    I'm not familiar with this past study. Were its MAGs retrieved from the same WWTP? Was a mapping threshold on coverage or breadth used to ensure that a population similar to the one represented by each genome is actually present in the sample? Breadth is how much of the genome is actually covered: with a breadth of 90% and coverage of 20X, 90% of the genome is covered. With very high coverage but low breadth, the reads could be mapping to a highly conserved region rather than to that specific population.

  2. Each phylogenomic tree was constructed using ITOL v2.1.7

    I wasn't aware that ITOL could construct a phylogenetic tree; I've only used it as a tree viewer. Please state which program was used to build the tree from the MUSCLE alignment (FastTree or RAxML, for example) and the parameters used.

  3. To compare comammox and anammox ammonia oxidation rates with those reported in literature, abundance adjusted rates (μmol N/mg protein-h) were calculated by dividing the average ammonia consumption rate (mg-N/g TS-h) obtained from aerobic or anaerobic ammonia oxidation batch assays by the portion of total metagenomic reads mapping to comammox or anammox bacteria metagenome assembled genomes (see below) as their approximate contribution to total solids measured and then using the conversion factor 1.9 mg dry weight/mg protein25.

    This abundance adjustment, based on metagenomic reads mapping back to anammox/comammox MAGs, seems highly dependent on how contiguous the assembly is and on whether the assembled MAGs actually represent the population responsible for this activity. I'm therefore not sure this is the most accurate way to calculate the rate. Could it be done better with lineage-specific qPCR primers or an activity-based assay?
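
To illustrate the sensitivity, here is a minimal sketch of the calculation as I read it from the quoted sentence. It assumes that total solids can be treated as dry weight and that the mapped-read fraction stands in for the organism's share of that dry weight. The function name and the input values are hypothetical, not taken from the manuscript.

```python
N_MOLAR_MASS = 14.007  # mg N per mmol N (equivalently, ug per umol)


def abundance_adjusted_rate(rate_mg_n_per_g_ts_h, mapped_read_fraction,
                            dw_per_protein=1.9):
    """Convert a bulk rate (mg-N/g TS-h) to umol N/mg protein-h.

    Assumption: the fraction of reads mapping to the target MAGs
    approximates the target organism's share of total solids.
    """
    if not 0 < mapped_read_fraction <= 1:
        raise ValueError("mapped_read_fraction must be in (0, 1]")
    # mg N per g of target-organism dry weight per hour
    per_g_dry_weight = rate_mg_n_per_g_ts_h / mapped_read_fraction
    # mg N per g protein per hour (1.9 mg dry weight per mg protein)
    per_g_protein = per_g_dry_weight * dw_per_protein
    # mg N/g protein equals ug N/mg protein; divide by ug per umol
    return per_g_protein / N_MOLAR_MASS


# Hypothetical bulk rate with two different mapped-read fractions
print(abundance_adjusted_rate(1.0, 0.10))
print(abundance_adjusted_rate(1.0, 0.05))
```

Because the mapped fraction sits in the denominator, halving it doubles the adjusted rate, so any MAG that recruits too few (or too many) reads shifts the result proportionally.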

  4. The decrease in the Nitrospira abundance could be the reason why several of the previously assembled MAGs could not be assembled in the current study despite the fact that 5 out of 7 of the previously assembled Nitrospira MAGs had 90% of their genomes covered using reads from this study

    Again, I don't think this is the only reason. You could test this with a coassembly, even though that will increase complexity and can make the assembly more fragmented; from the coassembly you could pull out just the putative comammox bins of interest and ignore everything else. The other possibility is that, although you observed low strain diversity in the previous study (which I haven't read), strain diversity is higher in these samples, which would also make assembly difficult.

  5. but at very low abundances and thus their genomes were not successfully reconstructed.

    I'm not sure this shows that the potential comammox bacteria/AOB failed to assemble because they were at low abundance. Strain diversity could be higher in these samples than in those from which the previous MAGs were assembled, and the contig that aligned at high percentage could simply be highly conserved or have low diversity. You could instead check whether the contig with amoA ended up in a low-quality bin and calculate nucleotide diversity on the contigs in that bin; high diversity on the other contigs could explain why the genome didn't assemble well.
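
As a rough sketch of what I mean by nucleotide diversity, the snippet below uses one common per-site definition, 1 minus the sum of squared allele frequencies, averaged over positions. This is an assumption about the metric, not a description of any particular tool's output, and the allele counts are made up.

```python
def site_diversity(allele_counts):
    """Per-site diversity: 1 - sum(p_i^2) over observed allele frequencies."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("site has no mapped bases")
    return 1.0 - sum((n / total) ** 2 for n in allele_counts.values())


def mean_diversity(sites):
    """Average the per-site diversity across all positions of a contig."""
    if not sites:
        raise ValueError("contig has no sites")
    return sum(site_diversity(s) for s in sites) / len(sites)


# Hypothetical contig with one fixed site and one evenly split site
contig = [{"A": 10}, {"A": 5, "G": 5}]
print(mean_diversity(contig))
```

Comparing this value between the amoA contig and the other contigs in the same bin would show whether the amoA contig is unusually conserved.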

  6. Biomass attached to six pieces of media collected from the aeration tank were scrapped using a sterile scalpel and homogenized using a sterile loop.

    I might just be misunderstanding how the apparatus or biofilm is structured, but is it fine to homogenize biomass from six pieces of media in this way? Is it expected that from these different pieces they should be pretty similar or would heterogeneity impact downstream analysis?

  7. mapping all sample reads

    I think I'm confused by how many samples there are - from the methods above for DNA extraction it makes it seem that there are 6 samples that are homogenized into one and there is only one sample that is sequenced. Whereas here there is reference to multiple samples that reads are mapped from.

  8. Therefore, the relative abundance of all nitrifying groups was calculated from a set of dereplicated MAGs recovered from both studies (Table SI-3).

    I think this could be inaccurate without the coverage and breadth statistics mentioned in a prior comment to confirm that these populations are actually "present" in the sample. For example, Crits-Christoph et al., when mapping reads from soil samples to MAGs, required at least 50% of the genome to be covered at 5X, so the breadth threshold there is 0.5. I can't tell from this statement whether you are requiring 50X coverage, or 50% breadth at some specific coverage. Because you refer to the 50% as coverage but explain it with the definition of breadth, it's a little confusing.
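
A detection rule of this kind could be sketched as follows. The defaults mirror the 5X and 50% thresholds mentioned above; the function name and the example depth profiles are hypothetical.

```python
def is_detected(depths, min_depth=5, min_breadth=0.5):
    """Call a genome 'present' if enough of it is covered deeply enough.

    depths: per-position read depth along the genome.
    """
    if not depths:
        return False
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths) >= min_breadth


# Half of the positions reach 5X or more: passes the filter
print(is_detected([6] * 5 + [0] * 5))
# Deep coverage on a small region only: fails despite high depth
print(is_detected([50] * 2 + [0] * 8))
```

Stating the rule in this form (a minimum depth plus a minimum breadth at that depth) would remove the ambiguity about which quantity the 50% refers to.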

  9. Nitrospira and Brocadia MAGs represented 6.53 ± 0.34 % and 6.25 ± 1.33% of total reads in the sample

    It might be good to also include stats of the % of reads mapping back to the entire metagenomic assembly to give context for how complete your recovery effort was

  10. Further, the genome coverage of previously assembled comammox (JAMMSM_CMX_1) and Nitrosomonas (JAMMSM_AOB_1) MAGs were 80.6 ± 9.8 and 72.3 ± 1.0%, respectively

    So I think this answers my previous question: the prior assemblies have coverage of approximately 80X and 70X, and you required at least 50% breadth? Please clarify this and report the actual breadth of these genomes for the samples mapped back to them. For full-scale WWTPs, I've seen reads map back to MAGs retrieved from different samples with breadth in the 90%+ range.

  11. Methodological details, additional figures, and tables are provided in Supplemental Materials.

    Looking further in the SI, I think there is some confusion about what "genome coverage" refers to, and the usage also flip-flops in the main text. Coverage is how many times a position is covered by reads, so 20X coverage means that a position is overlapped by 20 reads. This is also referred to as depth. The value I see in the SI table under "genome coverage", and sometimes referred to throughout the text, is actually breadth: the fraction of the genome that is covered, which should be between 0 and 1 (or 0 and 100%). This is described in the inStrain paper: https://www.nature.com/articles/s41587-020-00797-0. I'm not sure whether the authors are getting these coverage/breadth values from CoverM or inStrain, but it's unclear in the paper which one they are referring to, which is an important distinction when using genomes that were assembled outside of the samples in question.
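
To make the distinction concrete, here is a minimal sketch that computes both quantities from a list of per-position depths. The function name and the two toy 10 bp "genomes" are hypothetical; real values would come from a tool's per-base depth output.

```python
def mean_depth_and_breadth(depths, min_depth=1):
    """Return (mean depth, breadth) for per-position read depths.

    Mean depth is the average number of reads over each position.
    Breadth is the fraction of positions with at least min_depth reads.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    covered = sum(1 for d in depths if d >= min_depth)
    return mean_depth, covered / len(depths)


# Reads pile up on one conserved region only
conserved_only = [100, 100, 0, 0, 0, 0, 0, 0, 0, 0]
# Reads spread across almost the whole genome
evenly_covered = [20] * 9 + [0]

print(mean_depth_and_breadth(conserved_only))
print(mean_depth_and_breadth(evenly_covered))
```

The two profiles have similar mean depth (20X and 18X) but very different breadth (0.2 and 0.9), which is why reporting depth alone cannot show that a population is present.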

  12. Supporting information

    I didn't see a section describing data availability for the metagenome or the MAGs assembled in this study. Will the data be made publicly available in the SRA/GenBank?

  13. Comammox and Nitrosomonas relative abundances were about 0.90 ± 0.8 RPKM and 0.40 ± 0.05 RPKM, respectively (Figure 5C). This differs from our prior work, where comammox and Nitrosomonas relative abundances were 22 ± 6.26 and 21.04 ± 6.17 RPKM, respectively (Figure 5B). Thus, it is very likely that the low abundance of comammox bacteria and Nitrosomonas affected the assembly and binning process, which did not allow for the reconstruction of these genomes even though they are still present in the system.

    I'm confused about which mapping stats, to which MAGs, you are referring to in reaching this conclusion. Is it the relative abundance of the MAGs assembled in the prior study that is low, and are you inferring that this is why you couldn't assemble comammox MAGs in this study?
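
For reference, the standard RPKM definition is reads mapped per kilobase of reference per million total reads. The sketch below applies it to a whole genome; the function name and the numbers are hypothetical, not values from the manuscript.

```python
def rpkm(mapped_reads, genome_length_bp, total_reads):
    """Reads per kilobase of reference per million reads in the sample."""
    if genome_length_bp <= 0 or total_reads <= 0:
        raise ValueError("genome length and total reads must be positive")
    kilobases = genome_length_bp / 1e3
    millions_of_reads = total_reads / 1e6
    return mapped_reads / (kilobases * millions_of_reads)


# Hypothetical: 4,000 reads map to a 4 Mbp MAG out of 10 million reads
print(rpkm(4_000, 4_000_000, 10_000_000))
```

Since the value depends entirely on which reference the reads are mapped to, the text should say whether the reference MAGs are from this study or the prior one.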

  14. Brocadia (n=2) and Nitrospira (n=3) MAGs recovered from this study (Table SI-3) were

    I think the table describing these 5 MAGs should be a main table (keeping the SI table for the reference genomes), modified to include the GTDB taxonomy, % GC, and length in Mbp (or otherwise make the units clear), with the number of contigs reported as whole numbers. You might also want to include the relative abundance of each genome per sample.
