Diverse and abundant phages exploit conjugative plasmids

Natalia Quinones-Olvera
Siân V. Owen
Lucy M. McCully
Maximillian G. Marin
Eleanor A. Rand
Alice C. Fan
Oluremi J. Martins Dosumu
Kay Paul
Cleotilde E. Sanchez Castaño
Rachel Petherbridge
Jillian S. Paull
Michael Baym

This article has been Reviewed by the following groups


Listed in

Evaluated articles (Arcadia Science)

Abstract

Phages exert profound evolutionary pressure on bacteria by interacting with receptors on the cell surface to initiate infection. While the majority of phages use chromosomally encoded cell surface structures as receptors, plasmid-dependent phages exploit plasmid-encoded conjugation proteins, making their host range dependent on horizontal transfer of the plasmid. Despite their unique biology and biotechnological significance, only a small number of plasmid-dependent phages have been characterized. Here we systematically search for new plasmid-dependent phages targeting IncP and IncF plasmids using a targeted discovery platform, and find that they are common and abundant in wastewater, and largely unexplored in terms of their genetic diversity. Plasmid-dependent phages are enriched in non-canonical types of phages, and all but one of the 65 phages we isolated were non-tailed, and members of the lipid-containing tectiviruses, ssDNA filamentous phages or ssRNA phages. We show that plasmid-dependent tectiviruses exhibit profound differences in their host range which is associated with variation in the phage holin protein. Despite their relatively high abundance in wastewater, plasmid-dependent tectiviruses are missed by metaviromic analyses, underscoring the continued importance of culture-based phage discovery. Finally, we identify a tailed phage dependent on the IncF plasmid, and find related structural genes in phages that use the orthogonal type 4 pilus as a receptor, highlighting the evolutionarily promiscuous use of these distinct contractile structures by multiple groups of phages. Taken together, these results indicate plasmid-dependent phages play an under-appreciated evolutionary role in constraining horizontal gene transfer via conjugative plasmids.

Version published to 10.1038/s41467-024-47416-z
Apr 12, 2024
Arcadia Science
Apr 25, 2023

Kraken2 (See Methods)

Do you have a sense for how robust Kraken2 should be to the degree of polymorphisms that you see across tectiviruses? Since it's a k-mer-based tool, at some point enough polymorphisms would break it, but idk where that threshold is relative to what you see in natural tectivirus populations. Overall I'm really fascinated that these viruses are evading your detection in metagenomes, and I'm super curious as to why.

Another approach would be to assemble the viromes and then see if you can pull out contigs with BLAST hits to tectiviral marker genes. You could also try querying for the marker genes directly on the assembly graph with a tool like spacegraphcats to ensure that you aren't missing the virus due to assembly failure.

Assembling the metavirome would also allow you to demonstrate some of the features of the viruses that you do recover using these methods. Are there dsDNA viruses smaller than tectiviruses recovered robustly? I would imagine yes bc even tho 12kb is smallish, it's not unreasonably tiny. Some of the phages that I've worked w in metagenomes have also been around this size range and they are incredibly abundant and easy to find, which makes me feel like something else is going on.
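
The question about where polymorphism "breaks" a k-mer-based classifier can be made concrete with a back-of-the-envelope model. The sketch below is an illustration only, not the authors' analysis: it assumes substitutions are independent and evenly spread, and it ignores Kraken2's minimizer and spaced-seed scheme, so it should be read as a rough guide rather than a prediction of Kraken2's behaviour. The value k = 35 is Kraken2's documented default for nucleotide databases; treat it as an assumption if a custom database was built. The function name and the divergence values are hypothetical.

```python
def kmer_survival(divergence: float, k: int = 35) -> float:
    """Fraction of k-mers expected to carry no substitution.

    Assumes substitutions are independent and uniformly spread along the
    genome, which real tectivirus populations need not satisfy.
    """
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must be between 0 and 1")
    if k < 1:
        raise ValueError("k must be a positive integer")
    # A k-mer stays an exact match only if all k positions are unchanged.
    return (1.0 - divergence) ** k


if __name__ == "__main__":
    # Hypothetical per-site divergence values, chosen only for illustration.
    for divergence in (0.01, 0.05, 0.10, 0.20):
        fraction = kmer_survival(divergence)
        print(f"divergence {divergence:.0%}: {fraction:.1%} of 35-mers intact")
```

Under this simplified model the share of exact-match k-mers falls off geometrically with divergence, so a moderate level of nucleotide diversity could already leave few matching k-mers per read.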

Arcadia Science
Apr 21, 2023

Thank you for taking the time to read and comment!

Arcadia Science
Apr 21, 2023

Great points, I hope my responses throughout clarify some of these observations. I've really enjoyed going through the comments, and it's tremendously useful feedback. Thanks for taking the time!

Arcadia Science
Apr 21, 2023

Interesting point. We didn't have the chance to really dig into why they're getting lost in metagenomes. I would love to find out! There might be some extraction biases, but I also think that having genomes an order of magnitude smaller than other phages doesn't help. It would definitely be very interesting to look into this more carefully; it might give us good ideas about other things we're missing.

Arcadia Science
Apr 21, 2023

Again, hopefully this is more clear with the missing paragraphs, but the overall idea was: First, let's look at the IMG assembled genomes (diverse in terms of environments). Didn't find any alphatectiviruses. Maybe they're not getting sequenced well, maybe they're just not getting assembled? We can give ourselves a better shot at finding them if we look in a more targeted way. Given that we find them at very high abundances in all the wastewater we test, let's do a small test sequencing the very samples where we measured abundance (Fig4a). We did this, and found very few reads assigned to alphatectiviruses (4d, red dots). They'll never assemble from our dataset. Can we maybe overcome this with numbers? Let's try looking at a bunch of other wastewater datasets, picking some larger ones, some with different processing techniques. We do recover reads that look good, but still, not enough for assembling. (Also, we worry that even with enough reads, the very high degree of polymorphisms would make it difficult to resolve the graph.)
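
The "not enough reads for assembling" reasoning is a depth-of-coverage argument, which can be sketched numerically. The example below is illustrative only: the genome length, read length, read count, and target coverage are hypothetical placeholders, not values from the paper, and the function names are invented for this sketch. What counts as "enough" coverage depends on the assembler and on how polymorphic the population is.

```python
import math


def expected_coverage(n_reads: int, read_length: int, genome_length: int) -> float:
    """Mean depth of coverage if reads land uniformly on the genome."""
    if n_reads < 0 or read_length <= 0 or genome_length <= 0:
        raise ValueError("counts and lengths must be positive")
    return n_reads * read_length / genome_length


def reads_for_coverage(target: float, read_length: int, genome_length: int) -> int:
    """Smallest number of reads that reaches the target mean depth."""
    if target <= 0 or read_length <= 0 or genome_length <= 0:
        raise ValueError("target and lengths must be positive")
    return math.ceil(target * genome_length / read_length)


if __name__ == "__main__":
    # Hypothetical numbers: a 15 kb genome, 150 bp reads, 20 assigned reads.
    depth = expected_coverage(n_reads=20, read_length=150, genome_length=15_000)
    needed = reads_for_coverage(target=20, read_length=150, genome_length=15_000)
    print(f"mean depth from 20 reads: {depth:.2f}x")
    print(f"reads needed for 20x depth: {needed}")
```

With placeholder inputs like these, a handful of assigned reads corresponds to a fraction of 1x depth, which is consistent with the reply's point that such a dataset cannot yield an assembly.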

Arcadia Science
Apr 21, 2023

Hopefully the missing paragraphs account for this! (and we'll work on integrating the rest of your feedback too!)

Arcadia Science
Apr 21, 2023

Although we only labeled PRD1, the tree actually includes the 12 that we think correspond to distinct species, but they all overlap at this scale, showing how different the IMG genomes really are from the alphas. (Here are the tree files if you wish to see the individual labels.) We'll clarify this in the text.

Arcadia Science
Apr 21, 2023

The IMG search with the coat protein does retrieve a bunch of tectiviruses, but none of them seem to be alphatectiviruses. The Pfam model for the coat (PF09018) is built using only the PRD1 sequence, but the model hits the alphas, the gamma, and that whole limbo of uncultured ones. Looking at them (4b,c) you can see that they're quite different from PRD1. I have a tree, similar to 4c, but built only with the coat protein. Maybe that would be narratively easier to follow: Search with coat -> Found hits -> Tree of coat shows they're related, but not quite alphas. I decided to go for the terminase tree instead because, while with the coat you can't really get a good alignment that includes all the known tectiviruses, the terminase one allowed me to include the beta-, delta- and epsilon-tectiviruses, so I thought it provided a better context of where the uncultivated tectiviruses fall. I've added that data to the GitHub if you're interested. Great feedback, we'll try to clarify this part and include additional supplementary materials!

Arcadia Science
Apr 21, 2023

Yes! It has also been corrected in the updated version.

Arcadia Science
Apr 21, 2023

Good point!

Arcadia Science
Apr 21, 2023

YES! Thank you so much for these comments, it made us realize that at some point we lost two full paragraphs (!) from the manuscript. We've now updated the bioRxiv version to have them. (At the time of this comment it's not live yet, but hopefully soon.)

Arcadia Science
Apr 21, 2023

Thank you for taking the time to read it and to write all these comments! (Love this!)

Arcadia Science
Apr 14, 2023

highlights the power of Phage DisCo to uncover unknown phage diversity

love your approach and congrats on finding cool diversity!

Arcadia Science
Apr 14, 2023

indicating a blind spot in bulk-sequencing

Looking at the metagenomes you searched, it seems like they were solely wastewater metagenomes. Is there a reason to think that IncP PDPs are not found in other environments? I think the claims in this paragraph may be a little too broadly stated.

Arcadia Science
Apr 14, 2023

Viral

Overall amazing work! The assay is really clever, and you pulled out a ton of cool phage diversity. I really liked the phenotypic host range assessment - it was a surprise that you got such different phenotypes from such similar genotypes. Congrats on a fascinating paper!

Arcadia Science
Apr 14, 2023

no accessory genes

wild that they don't have accessory genes!!!

Arcadia Science
Apr 14, 2023
Still, the discrepancy points to the continued need for systematic culture-based viral discovery.

It would be helpful to evaluate these possibilities more systematically in the section of the paper where you discuss the absence of tectiviruses from metagenomic sequencing datasets.

1. Did you have any trouble isolating DNA from the tectiviruses in culture that would hint at their exclusion from bulk DNA harvests? If you suspect the nature of the DNA is the problem, you could consider looking at RNAseq datasets for tectiviral reads (see https://pubmed.ncbi.nlm.nih.gov/32517038/)
2. When you sequenced the wastewater, did you find the amount of tectivirus reads you expected to? Some of your figures might be getting at that, but they are not discussed in the text (4D, E), making it hard to know what your interpretation is.
3. If you did recover a reasonable amount of tectivirus reads from your sequencing, did they assemble into genomes? Assembly issues feel like the most likely culprit to me, given how shockingly similar their genome content is, alongside the high levels of nt diversity at some positions. Two ways to approach this problem would be to either work directly from assembly graphs using a tool like spacegraphcats (https://github.com/spacegraphcats) or to use long read sequencing. There might already be some good wastewater long read metagenomes out there to look at.
4. If assembly isn't the problem and they do assemble into genomes, are those genomes found by tools like genomad (https://github.com/apcamargo/genomad), which was used to make the current iteration of IMG/VR?
Arcadia Science
Apr 14, 2023

their absence from metagenomic datasets

I think this is an interesting point, but overall feel that the metagenomic section of the results could use a little more fleshing out/explanation

Arcadia Science
Apr 14, 2023

DNA genomes with covalently bound proteins2

would looking at metatranscriptomes be informative here? would you expect that phages that drop out of metagenomes for some of these reasons to still show up in metatranscriptomes?

Arcadia Science
Apr 14, 2023

e,

Related to adding text relevant to d and e, it might be nice to note in it whether the metagenomes where reads did map to PRD1 have anything in common (e.g., similar environments or geography).

Arcadia Science
Apr 14, 2023

However, none of the uncultivated viral genomes appear to belong to any of the pre-existing groups of isolated tectiviruses

It might be helpful to include your isolated phages in this tree to see whether the IMG genomes cluster with them. Since you only have one representative of alphatectiviruses on the tree, it is more challenging to draw conclusions about the relatedness of the IMG genomes to isolated alphatectiviruses more broadly.

Arcadia Science
Apr 14, 2023

DNA packaging ATPase

might be worth calling this a terminase, if that would be more recognizable to readers

Arcadia Science
Apr 14, 2023

Figure 4

Should there be text referencing Figure 4 d and e? I am having trouble finding it. Having corresponding text referencing d and e would be really helpful in following the story and in creating a convincing argument that these viruses are missing from metagenomes

Arcadia Science
Apr 14, 2023

alphatectiviruses have yet to be found in metagenomic analyses

What do you mean no one has found them? Below, you say that you were able to retrieve viral genomes from IMG using the PRD1 coat protein which is at odds with that statement.

Arcadia Science
Apr 14, 2023

c,

Do you mean D?

Arcadia Science
Apr 14, 2023

We speculate that these patterns might reflect the compositions of natural polymicrobial communities containing IncP plasmids, which require PDPs to rapidly adapt to infect particular assortments of taxonomically distant hosts.

This patchy host range could also be due to specialized anti-PRD immune systems, where protection against one PRD doesn't necessarily guarantee protection from a closely related PRD.

Arcadia Science
Apr 14, 2023

This observation formed the basis of the targeted phage discovery method we termed “Phage discovery by coculture” (Phage DisCo)

This is such a clever approach! I really love it. You could imagine expanding it to all sorts of other questions about phage host range, like finding phages that are resistant to different bacterial immune systems.

Arcadia Science
Apr 14, 2023

(Phage DisCo) (Figure 1b).

I love this assay! So simple but also powerful. Figure 1 is really nicely laid out.

Version published to 10.1101/2023.03.19.532758 on bioRxiv
Mar 19, 2023
