Diverse and abundant phages exploit conjugative plasmids

Abstract

Phages exert profound evolutionary pressure on bacteria by interacting with receptors on the cell surface to initiate infection. While the majority of phages use chromosomally encoded cell surface structures as receptors, plasmid-dependent phages exploit plasmid-encoded conjugation proteins, making their host range dependent on horizontal transfer of the plasmid. Despite their unique biology and biotechnological significance, only a small number of plasmid-dependent phages have been characterized. Here we systematically search for new plasmid-dependent phages targeting IncP and IncF plasmids using a targeted discovery platform, and find that they are common and abundant in wastewater, and largely unexplored in terms of their genetic diversity. Plasmid-dependent phages are enriched in non-canonical types of phages, and all but one of the 65 phages we isolated were non-tailed, and members of the lipid-containing tectiviruses, ssDNA filamentous phages or ssRNA phages. We show that plasmid-dependent tectiviruses exhibit profound differences in their host range which is associated with variation in the phage holin protein. Despite their relatively high abundance in wastewater, plasmid-dependent tectiviruses are missed by metaviromic analyses, underscoring the continued importance of culture-based phage discovery. Finally, we identify a tailed phage dependent on the IncF plasmid, and find related structural genes in phages that use the orthogonal type 4 pilus as a receptor, highlighting the evolutionarily promiscuous use of these distinct contractile structures by multiple groups of phages. Taken together, these results indicate plasmid-dependent phages play an under-appreciated evolutionary role in constraining horizontal gene transfer via conjugative plasmids.

Article activity feed

  1. Kraken2 (See Methods)

    Do you have a sense of how robust Kraken2 should be to the degree of polymorphism that you see across tectiviruses? Since it's a k-mer-based tool, at some point enough polymorphisms would break it, but I don't know where that threshold is relative to what you see in natural tectivirus populations. Overall I'm really fascinated that these viruses are evading your detection in metagenomes, and I'm super curious as to why.

    Another approach would be to assemble the viromes and then see if you can pull out contigs with BLAST hits to tectiviral marker genes. You could also try querying for the marker genes directly on the assembly graph with a tool like spacegraphcats to ensure that you aren't missing the virus due to assembly failure.

    Assembling the metavirome would also allow you to demonstrate some of the features of the viruses that you do recover using these methods. Are dsDNA viruses smaller than tectiviruses recovered robustly? I would imagine yes, because even though 12 kb is smallish, it's not unreasonably tiny. Some of the phages that I've worked with in metagenomes have also been around this size range, and they are incredibly abundant and easy to find, which makes me feel like something else is going on.
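
As a rough guide to the k-mer question raised in this comment, the sketch below estimates how quickly exact k-mer matches disappear as two sequences diverge. It assumes substitutions fall independently at each site and uses k = 35 as an assumed k-mer length; Kraken2's minimizer-based matching is not modelled, and the function names are illustrative only.

```python
import random


def kmer_survival(divergence: float, k: int = 35) -> float:
    """Probability that a k-mer carries no substitution.

    Assumes each site mutates independently with probability `divergence`.
    k = 35 is an assumed k-mer length, not a value taken from the study.
    """
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must be between 0 and 1")
    return (1.0 - divergence) ** k


def shared_kmer_fraction(length: int, divergence: float,
                         k: int = 35, seed: int = 0) -> float:
    """Simulate a random reference, mutate it, and return the fraction of
    the variant's k-mers that are still present in the reference."""
    rng = random.Random(seed)
    ref = [rng.choice("ACGT") for _ in range(length)]
    variant = []
    for base in ref:
        if rng.random() < divergence:
            # Substitute with one of the three other bases.
            variant.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            variant.append(base)
    ref_seq, var_seq = "".join(ref), "".join(variant)
    ref_kmers = {ref_seq[i:i + k] for i in range(length - k + 1)}
    var_kmers = [var_seq[i:i + k] for i in range(length - k + 1)]
    return sum(kmer in ref_kmers for kmer in var_kmers) / len(var_kmers)


if __name__ == "__main__":
    for d in (0.01, 0.05, 0.10):
        print(d, round(kmer_survival(d), 3),
              round(shared_kmer_fraction(5000, d), 3))
```

Under this simplified model, the fraction of intact k-mers falls off geometrically with divergence, which is one way to frame where a k-mer classifier might start to lose sensitivity.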

  2. Great points, I hope my responses throughout clarify some of these observations. I've really enjoyed going through the comments, and it's tremendously useful feedback. Thanks for taking the time!

  3. Interesting point. We didn't have the chance to really dig into why they're getting lost in metagenomes. I would love to find out! There might be some extraction biases, but I also think that having genomes an order of magnitude smaller than other phages doesn't help. It would definitely be interesting to look into this more carefully; it might give us good ideas about other things we're missing.

  4. Again, hopefully this is more clear with the missing paragraphs, but the overall idea was: First, let's look at the IMG assembled genomes (diverse in terms of environments). Didn't find any alphatectiviruses. Maybe they're not getting sequenced well, maybe they're just not getting assembled? We can give ourselves a better shot at finding them if we look in a more targeted way. Given that we find them at very high abundances in all the wastewater we test, let's do a small test sequencing the very samples where we measured abundance (Fig4a). We did this, and found very few reads assigned to alphatectiviruses (4d, red dots). They'll never assemble from our dataset. Can we maybe overcome this with numbers? Let's try looking at a bunch of other wastewater datasets, picking some larger ones, some with different processing techniques. We do recover reads that look good, but still, not enough for assembling. (Also, we worry that even with enough reads, the very high degree of polymorphisms would make it difficult to resolve the graph.)
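
The "too few reads to assemble" reasoning in this reply can be put in numbers with the standard Lander-Waterman expectation. This is a minimal sketch: the read count, read length, and genome length below are hypothetical placeholders, not values from the study.

```python
import math


def expected_coverage(n_reads: int, read_length: int, genome_length: int) -> float:
    """Mean depth of coverage: total sequenced bases divided by genome length."""
    return n_reads * read_length / genome_length


def fraction_covered(coverage: float) -> float:
    """Lander-Waterman expectation for the fraction of the genome hit by
    at least one read, assuming reads land uniformly at random."""
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical example: 20 reads of 150 bp assigned to a 15,000 bp genome.
    depth = expected_coverage(n_reads=20, read_length=150, genome_length=15_000)
    print(f"mean depth: {depth:.2f}x")
    print(f"expected fraction of genome covered: {fraction_covered(depth):.2f}")
```

At depths well below 1x most of the genome is not sampled at all, so no assembler can reconstruct it, whatever the level of polymorphism.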

  5. Although we only labeled PRD1, the tree actually includes the 12 that we think correspond to distinct species, but they all overlap at this scale, showing how different the IMG genomes really are from the alphas. (Here are the tree files if you wish to see the individual labels.) We'll clarify this in the text.

  6. The IMG search with the coat protein does retrieve a bunch of tectiviruses, but none of them seem to be alphatectiviruses. The pfam model for the coat (PF09018) is built using only the PRD1 sequence, but the model hits the alphas, the gamma, and that whole limbo of uncultured ones. Looking at them (4b,c) you can see that they're quite different from PRD1. I have a tree, similar to 4c, but built only with the coat protein. Maybe that would be narratively easier to follow: Search with coat -> Found hits -> Tree of coat shows they're related, but not quite alphas. I decided to go for the terminase tree instead because, while with the coat you can't really get a good alignment that includes all the known tectiviruses, the terminase one allowed me to include the beta-, delta- and epsilon-tectiviruses, so I thought it provided a better context of where the uncultivated tectiviruses fall. I've added that data to the GitHub if you're interested. Great feedback, we'll include additional supplementary materials to make this part clearer!

  7. YES! Thank you so much for these comments, it made us realize that at some point we lost two full paragraphs (!) from the manuscript. We've now updated the bioRxiv version to have them. (At the time of this comment it's not live yet, but hopefully soon.)

  8. indicating a blind spot in bulk-sequencing

    Looking at the metagenomes you searched, it seems like they were solely wastewater metagenomes. Is there a reason to think that IncP PDPs are not found in other environments? I think the claims in this paragraph may be a little too broadly stated.

  9. Viral

    Overall amazing work! The assay is really clever, and you pulled out a ton of cool phage diversity. I really liked the phenotypic host range assessment - it was a surprise that you got such different phenotypes from such similar genotypes. Congrats on a fascinating paper!

  10. Still, the discrepancy points to the continued need for systematic culture-based viral discovery.

    It would be helpful to evaluate these possibilities more systematically in the section of the paper where you discuss the absence of tectiviruses from metagenomic sequencing datasets.

    1. Did you have any trouble isolating DNA from the tectiviruses in culture that would hint at their exclusion from bulk DNA harvests? If you suspect the nature of the DNA is the problem, you could consider looking at RNAseq datasets for tectiviral reads (see https://pubmed.ncbi.nlm.nih.gov/32517038/)

    2. When you sequenced the wastewater, did you find the number of tectivirus reads you expected? Some of your figures might be getting at that, but they are not discussed in the text (4D, E), making it hard to know what your interpretation is.

    3. If you did recover a reasonable number of tectivirus reads from your sequencing, did they assemble into genomes? Assembly issues feel like the most likely culprit to me, given how shockingly similar their genome content is, alongside the high levels of nt diversity at some positions. Two ways to approach this problem would be to either work directly from assembly graphs using a tool like spacegraphcats (https://github.com/spacegraphcats) or to use long read sequencing. There might already be some good wastewater long read metagenomes out there to look at.

    4. If assembly isn't the problem and they do assemble into genomes, are those genomes found by tools like geNomad (https://github.com/apcamargo/genomad), which was used to make the current iteration of IMG/VR?

  11. their absence from metagenomic datasets

    I think this is an interesting point, but overall I feel that the metagenomic section of the results could use a little more fleshing out and explanation.

  12. DNA genomes with covalently bound proteins2

    Would looking at metatranscriptomes be informative here? Would you expect phages that drop out of metagenomes for some of these reasons to still show up in metatranscriptomes?

  13. e,

    Related to adding text about panels d and e, it might be nice to note there whether the metagenomes in which reads did map to PRD1 have anything in common (e.g. similar environments or geography).

  14. However, none of the uncultivated viral genomes appear to belong to any of the pre-existing groups of isolated tectiviruses

    It might be helpful to include your isolated phages in this tree to see whether the IMG genomes cluster with them. Since you only have one representative of the alphatectiviruses on the tree, it is more challenging to draw conclusions about the relatedness of the IMG genomes to isolated alphatectiviruses more broadly.

  15. Figure 4

    Should there be text referencing Figure 4 d and e? I am having trouble finding it. Having corresponding text referencing d and e would be really helpful in following the story and in creating a convincing argument that these viruses are missing from metagenomes

  16. alphatectiviruses have yet to be found in metagenomic analyses

    What do you mean no one has found them? Below, you say that you were able to retrieve viral genomes from IMG using the PRD1 coat protein which is at odds with that statement.

  17. We speculate that these patterns might reflect the compositions of natural polymicrobial communities containing IncP plasmids, which require PDPs to rapidly adapt to infect particular assortments of taxonomically distant hosts.

    This patchy host range could also be due to specialized anti-PRD immune systems where protection against one PRD doesn't necessarily guarantee protection from a closely related PRD

  18. This observation formed the basis of the targeted phage discovery method we termed “Phage discovery by coculture” (Phage DisCo)

    This is such a clever approach! I really love it. You could imagine expanding to all sorts of other questions about phage host range, like finding phages that are resistant to different bacterial immune systems.