Identifying Eukaryotes and Factors Influencing Their Biogeography in Drinking Water Metagenomes

Abstract

No abstract available

Article activity feed

  1. 3.2 Factors affecting eukaryotic abundance in DWDS metagenomes

    I'm not sure if this is helpful, but especially if you end up with specific genomes that you want to look for, you could try using sourmash branchwater: https://www.biorxiv.org/content/10.1101/2022.11.02.514947v1. If you have a eukaryotic genome you're interested in, you could sketch it (sourmash sketch) and then use the branchwater tool to search most metagenomes in the SRA to see which ones have high containment with the genome you searched. You could then use the SRA metadata tables to filter to wastewater samples and then dig further into their biogeography.
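
    In case it helps to make "containment" concrete, here is a minimal sketch of the quantity being searched on: the fraction of a query genome's k-mers that also occur in a metagenome. This is an illustration only. It compares exact sets of canonical k-mers, whereas sourmash works on subsampled (FracMinHash) sketches, and the function names and toy sequences below are made up for the example.

```python
import random


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def canonical_kmers(seq, k):
    """Collect canonical k-mers (the lexicographic minimum of each k-mer
    and its reverse complement), skipping windows with non-ACGT bases."""
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def containment(query, target, k=21):
    """Fraction of the query's k-mers that are also present in the target."""
    query_kmers = canonical_kmers(query, k)
    if not query_kmers:
        return 0.0
    target_kmers = canonical_kmers(target, k)
    return len(query_kmers & target_kmers) / len(query_kmers)


if __name__ == "__main__":
    # Hypothetical toy data: a random "genome" embedded in a larger
    # random "metagenome", plus an unrelated random sequence.
    rng = random.Random(0)

    def random_dna(n):
        return "".join(rng.choice("ACGT") for _ in range(n))

    genome = random_dna(500)
    metagenome = random_dna(1000) + genome + random_dna(1000)
    unrelated = random_dna(2500)

    print("genome in metagenome:", containment(genome, metagenome))
    print("genome in unrelated: ", containment(genome, unrelated))
```

    Containment is asymmetric on purpose: a small genome can be fully contained in a large metagenome even though the two share only a tiny fraction of their combined k-mers.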

  2. The majority of the sequenced data in metagenomic assemblies from complex environmental samples are typically contained in short contigs (e.g., < 5 kbp), especially in case of complex communities with low abundance organisms [17,75,76]

    This would be really helpful context to have in the introduction, since it would explain why you structured the methods around short contigs the way you did.
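
    If you wanted to back this statement with a number for your own assemblies, a rough sketch of the arithmetic is below: the share of assembled bases that sit in contigs under a length cutoff. The function name and the contig lengths are invented for illustration and are not taken from the manuscript.

```python
def fraction_in_short_contigs(contig_lengths, cutoff_bp=5000):
    """Fraction of total assembled bases found in contigs shorter than cutoff_bp."""
    total_bases = sum(contig_lengths)
    if total_bases == 0:
        return 0.0
    short_bases = sum(n for n in contig_lengths if n < cutoff_bp)
    return short_bases / total_bases


if __name__ == "__main__":
    # Made-up contig lengths in bp, purely for demonstration.
    example_lengths = [800, 1200, 2500, 4000, 6000, 15000]
    share = fraction_in_short_contigs(example_lengths)
    print(f"{share:.1%} of assembled bases are in contigs < 5 kbp")
```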

  3. k-mer signature differences

    Would you be willing to briefly state the k-mer size used for this? I could imagine very different results for a k-mer size of 4 (tetranucleotide abundances) vs. 21 or 31 (which are generally genus- or species-specific).
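
    To illustrate why I'm asking, here is a small sketch that scores the same pair of sequences at k = 4 and k = 21 using cosine similarity of k-mer counts. The sequences are synthetic (a random sequence and a copy with every 15th base substituted, roughly 7% divergence), and the helper names are hypothetical. I have no idea whether this resembles the signature the manuscript actually uses.

```python
import math
import random
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two k-mer count profiles."""
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[kmer] * counts_b[kmer] for kmer in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def make_pair(length=5000, step=15, seed=0):
    """Return a random sequence and a copy with every step-th base substituted."""
    rng = random.Random(seed)
    original = [rng.choice("ACGT") for _ in range(length)]
    substitute = {"A": "C", "C": "G", "G": "T", "T": "A"}
    mutated = list(original)
    for i in range(0, length, step):
        mutated[i] = substitute[mutated[i]]
    return "".join(original), "".join(mutated)


if __name__ == "__main__":
    seq_a, seq_b = make_pair()
    for k in (4, 21):
        score = cosine_similarity(kmer_counts(seq_a, k), kmer_counts(seq_b, k))
        print(f"k={k}: cosine similarity = {score:.3f}")
```

    With a substitution every 15 bases, every 21-base window covers at least one changed position, so the two sequences share essentially no 21-mers, while their tetranucleotide profiles stay very similar. The same data can therefore look nearly identical or entirely different depending on k.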
