Predicting phage-bacteria interactions at the strain level from genomes

Abstract

Predicting how phages can selectively infect specific bacterial strains holds promise for developing novel approaches to combat bacterial infections and for better understanding microbial ecology. Experimental studies of phage-bacteria interactions have mostly focused on a few model organisms to understand the molecular mechanisms that make a particular bacterial strain susceptible to a given phage. However, both bacteria and phages are extremely diverse in natural contexts. How well the concepts learned from well-established experimental models generalize to the broad diversity encountered in the wild is currently unknown. Recent advances in genomics make it possible to identify traits involved in phage-host specificity, implying that these traits could be used to predict such interactions. Here, we show that we can predict the outcomes of most phage-bacteria interactions at the strain level in natural Escherichia isolates based solely on genomic data. First, we established a dataset of experimental outcomes of phage-bacteria interactions between 403 natural, phylogenetically diverse Escherichia strains and 96 bacteriophages, matched with fully sequenced and genomically characterized strains and phages. To predict these interactions, we set out to define genomic traits with predictive power. We show that most interactions in our dataset can be explained by adsorption factors, as opposed to antiphage systems, which play a marginal role. We then trained predictive algorithms to pinpoint which interactions could be accurately predicted and where future research should focus. Finally, we show the application of such predictions by establishing a pipeline that recommends tailored phage cocktails to target pathogenic strains from their genomes only, and we show the higher efficiency of tailored cocktails on a collection of 100 pathogenic E. coli isolates.
Altogether, this work provides quantitative insights into understanding phage–host specificity at the strain level and paves the way for the use of predictive algorithms in phage therapy.
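
The abstract does not spell out how the cocktail recommendation works. As a rough illustration only, one simple way to turn per-phage predictions into a cocktail is to rank phages by predicted probability of infecting the target strain and keep the top few. The sketch below assumes a model has already produced such probabilities; the function names, the phage labels, and the scores are hypothetical, and the independence assumption used to combine probabilities is ours, not the authors'.

```python
from math import prod


def recommend_cocktail(predicted, size=3):
    """Return the `size` phages with the highest predicted infection probability.

    `predicted` maps a phage name to a model score in [0, 1] for ONE target
    strain. Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(predicted.items(), key=lambda kv: (-kv[1], kv[0]))
    return [phage for phage, _ in ranked[:size]]


def cocktail_success_probability(predicted, cocktail):
    """Probability that at least one phage in the cocktail infects the strain.

    Assumes the per-phage outcomes are independent, which is a simplification.
    """
    return 1.0 - prod(1.0 - predicted[phage] for phage in cocktail)


# Hypothetical model outputs for a single target strain (made-up numbers).
scores = {"phage_A": 0.9, "phage_B": 0.2, "phage_C": 0.6, "phage_D": 0.5}
cocktail = recommend_cocktail(scores, size=2)
success = cocktail_success_probability(scores, cocktail)
```

A real pipeline would likely also weigh phage diversity (for example, avoiding phages that rely on the same receptor), which this sketch ignores.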

Article activity feed

  1. To predict these interactions, we set out to define genomic traits with predictive power. We show that most interactions in our dataset can be explained by adsorption factors as opposed to antiphage systems which play a marginal role.

    This is a really useful and impressive effort, both for understanding basic phage biology and for deploying phages therapeutically in the clinic. One thing that could be useful to add to the genomic traits you analyze here is endogenous prophages (and maybe even other MGEs). There are many documented ways that prophages can remodel the surface of the host and change adsorption, as well as have tons of interesting pro- and anti-immune system activities.

  2. Overall, these results show that bacterial defense systems are not required to predict the phage-bacteria interactions and can be removed from the set of candidate traits provided as input features to our models.

    Is one interpretation of this that anti-defense mechanisms are common, and so presence of a defense system is a poor predictor of phage sensitivity?

  3. However, this significance was lost whenever pairs of strains with a phylogenetic distance below 10⁻⁴ substitutions per position were removed (typically less than a few hundred SNPs on the whole core genome), indicating that the correlation was driven by very tightly related kins (Supplementary Figure 4). Our dataset shows that phylogeny poorly explains phage susceptibility.

    This is such a useful observation! I really appreciate that you performed this analysis with and without these highly related strains.
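
For readers who want to try the with/without comparison from item 3 on their own data, here is a minimal sketch. It assumes each strain pair is summarized as a tuple of (phylogenetic distance in substitutions per position, some dissimilarity between the two strains' phage susceptibility profiles). The 10⁻⁴ threshold comes from the quoted passage; the function names and the use of a plain Pearson correlation are illustrative assumptions, not the statistical test used in the article.

```python
from math import sqrt


def filter_close_pairs(pairs, min_distance=1e-4):
    """Drop strain pairs whose phylogenetic distance is below `min_distance`."""
    return [(dist, diff) for dist, diff in pairs if dist >= min_distance]


def pearson(xs, ys):
    """Pearson correlation coefficient, or None if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def correlation_with_and_without_close_pairs(pairs, min_distance=1e-4):
    """Return (r on all pairs, r after removing near-identical strain pairs)."""
    def r(subset):
        if len(subset) < 2:
            return None
        dists, diffs = zip(*subset)
        return pearson(dists, diffs)

    return r(pairs), r(filter_close_pairs(pairs, min_distance))
```

Comparing the two returned coefficients shows how much of an apparent phylogenetic signal depends on near-clonal pairs; assessing significance would need a permutation-based test on top of this.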