Mapping the functional landscape of the receptor binding domain of T7 bacteriophage by deep mutational scanning

Abstract

The interaction between a bacteriophage and its host is mediated by the phage's receptor binding protein (RBP). Despite its fundamental role in governing phage activity and host range, molecular rules of RBP function remain a mystery. Here, we systematically dissect the functional role of every residue in the tip domain of T7 phage RBP (1660 variants) by developing a high-throughput, locus-specific, phage engineering method. This rich dataset allowed us to cross compare functional profiles across hosts to precisely identify regions of functional importance, many of which were previously unknown. Substitution patterns showed host-specific differences in position and physicochemical properties of mutations, revealing molecular adaptation to individual hosts. We discovered gain-of-function variants against resistant hosts and host-constricting variants that eliminated certain hosts. To demonstrate therapeutic utility, we engineered highly active T7 variants against a urinary tract pathogen. Our approach presents a generalized framework for characterizing sequence–function relationships in many phage–bacterial systems.

Article activity feed

  1. Reviewer #4:

    This manuscript by Huss et al. is a major technological step forward for high-throughput phage research and a deep dive into the mutational landscape of a portion of the T7 phage receptor binding protein (RBP). The authors develop a new phage genome engineering method, ORACLE, that can generate a variant library for any region of the phage genome. They apply ORACLE to perform a deep mutational scan of the tip domain of the T7 RBP and screen for enrichment in several bacterial hosts. The authors find that different hosts give rise to distinct mutational profiles. Exterior loops involved in specialization towards a host appear to have the highest differential mutational sensitivity. The authors follow up these general scans in the background of phage-resistant hosts and find mutations that rescue phage infection. To demonstrate the utility of the approach on a clinically relevant task, the authors apply the library to a urinary tract-associated clinical isolate and produce a phage with much higher specificity, creating a potentially powerful narrow-scope antibiotic.

    Overall, the ORACLE method will be of tremendous use to the phage field, solving a technical challenge associated with phage engineering, and will illuminate new aspects of bacterial host-phage interactions. It was also quite nice to see host specialization validated and further explored with the screens done in the background of phage resistance mutations. The authors do a tremendous job digging, where possible, into potential mechanisms by which mutations could alter fitness. We especially appreciate how well the identity of amino acids tracks host specialization within exterior loops.

    We have no major concerns about the manuscript but have some minor comments to aid interpretation. There are also some minor technical issues. We think this manuscript will be of broad interest, especially for those in the genotype-phenotype, phage biology, and host-pathogen fields.

    Minor comments:

    P5L20: In the introduction to the ORACLE section, the authors mention homologous recombination and then mention using 'optimized recombination' that is done with recombinases. This contrast should be made explicit somewhere, perhaps to highlight the benefit of having specific recombinases.

    P6L16: Using Cas9 to cut unrecombined variants is clever... Cool! This is a real 21st-century DpnI idea.

    P6L27: The authors state that there is a mild skew towards more abundant members after ORACLE. Why might this be? Do more abundant members simply become even more abundant over iterations? To be clear, this isn't a substantial limitation, and it's common to see these sorts of changes during library generation. Just curious. Overall, it looks like a fantastic method.

    P7L6: Authors mention ORACLE increases the throughput of screens by 3-4 orders of magnitude. How many variants can one screen? Is this screen of a little over 1k variants at about the threshold of the assay?

    P8L7: The authors assign functional scores based on enrichment and normalize to wild type. Is FN = 1 equivalent to wild type?
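
    To make the question concrete, here is a minimal sketch of one way such a score could be defined. The formula (variant enrichment divided by wild-type enrichment), the function name, and all frequencies are assumptions for illustration, not the manuscript's stated definition. Under this assumed definition, wild type scores exactly 1 by construction.

```python
def functional_score(pre, post, wt_pre, wt_post):
    """Hypothetical normalized functional score.

    pre/post: variant frequency before and after selection on a host.
    wt_pre/wt_post: wild-type frequency before and after selection.
    Assumed formula: (post / pre) / (wt_post / wt_pre).
    """
    if pre <= 0 or wt_pre <= 0 or wt_post <= 0:
        raise ValueError("pre, wt_pre and wt_post must be positive")
    variant_enrichment = post / pre
    wt_enrichment = wt_post / wt_pre
    return variant_enrichment / wt_enrichment


# Made-up frequencies, for illustration only.
wt_pre, wt_post = 0.010, 0.020          # wild type doubles in frequency
variants = {
    "wild type": (0.010, 0.020),        # same behavior as wild type
    "enriched variant": (0.001, 0.008), # rises faster than wild type
    "depleted variant": (0.001, 0.0005),
}

scores = {
    name: functional_score(pre, post, wt_pre, wt_post)
    for name, (pre, post) in variants.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

    If the manuscript's definition matches this form, scores above 1 would indicate variants that outperform wild type on a given host and scores below 1 would indicate depletion; stating this explicitly in the text would help readers.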

    P9L5: Awesome!

    P10L7: Authors mention R542 forms a hook with a receptor. There should be a citation here.

    P10L21: For N501, R542, G479, and D540 there are wonderful mechanistic explanations. However, for D520 there is not. Is there any hypothesis for why this residue is distinct from the others? Are there other residues that behave similarly? I feel it would be really helpful to have a color scale in Figure 3A that discriminates between FN = 1 (assuming this is wild type) and enriched or depleted variants.

    P12L4: The authors note residues that are surface exposed yet intolerant to mutations in the previous paragraph. They also calculate free energy changes with Rosetta and state that free energy maps well onto tolerance. What is the 93% based on? Perhaps a truth/contingency table would be useful here to compare the groupings. Which residues are in the other 7%? Can the energy scores help explain the mechanisms behind the mutations?

    P12L7: The authors state that substitutions predicted to be stable and classified as intolerant could indicate residues necessary for all hosts. What about those that fall outside of these groupings? Unstable residues can also be necessary.
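
    As a rough illustration of the truth/contingency table suggested here, the sketch below cross-tabulates a predicted-stability call against an observed-tolerance call and reports the fraction that agree. The function names, the binary calls, and the example data are hypothetical; how the manuscript thresholds Rosetta energies or defines tolerance is not assumed.

```python
from collections import Counter


def contingency_table(predicted_stable, observed_tolerant):
    """Count the four combinations of (predicted stable, observed tolerant)."""
    if len(predicted_stable) != len(observed_tolerant):
        raise ValueError("inputs must have the same length")
    counts = Counter(zip(predicted_stable, observed_tolerant))
    return {
        "stable_tolerant": counts[(True, True)],
        "stable_intolerant": counts[(True, False)],
        "unstable_tolerant": counts[(False, True)],
        "unstable_intolerant": counts[(False, False)],
    }


def agreement(table):
    """Fraction of substitutions where prediction and observation agree."""
    total = sum(table.values())
    if total == 0:
        raise ValueError("empty table")
    return (table["stable_tolerant"] + table["unstable_intolerant"]) / total


# Made-up calls for eight substitutions, for illustration only.
predicted_stable = [True, True, True, False, False, True, False, True]
observed_tolerant = [True, True, False, False, False, True, True, True]

table = contingency_table(predicted_stable, observed_tolerant)
for cell, count in table.items():
    print(f"{cell}: {count}")
print(f"agreement: {agreement(table):.2f}")
```

    Reporting the off-diagonal cells (stable but intolerant, unstable but tolerant) alongside the headline percentage would show directly which substitutions make up the remainder.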

    P14L22: The authors mention comparing systematic truncations; however, they do not present any figure. These data should be shown in a figure, which would surely be helpful to people in the phage field, especially because this is one of the main discussion topics at the end of the manuscript.

    P16L2: The authors did the selection in the background of a clinically isolated strain and discussed three variants that were clonally characterized. Was this library sequenced as in the earlier screens?

    Figures:

    Barplots need significance tests.
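
    One simple option for small replicate counts is an exact two-sided permutation test on the difference in means, sketched below. This is only one possible choice of test, and the function name and example values are hypothetical rather than taken from the manuscript.

```python
from itertools import combinations


def exact_permutation_pvalue(group_a, group_b):
    """Exact two-sided permutation test on the absolute difference in means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes, so it is only practical for small samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both groups must be non-empty")
    total_sum = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    n_splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        a_sum = sum(pooled[i] for i in indices)
        diff = abs(a_sum / n_a - (total_sum - a_sum) / n_b)
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            extreme += 1
        n_splits += 1
    return extreme / n_splits


# Made-up triplicate measurements for two bars, for illustration only.
wild_type = [1.0, 2.0, 3.0]
variant = [10.0, 11.0, 12.0]
p_value = exact_permutation_pvalue(wild_type, variant)
print(f"p = {p_value:.3f}")
```

    Note that with three replicates per bar there are only 20 possible splits, so the smallest two-sided p-value this test can return is 0.1; reporting the number of replicates alongside any test would help readers judge the comparisons.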

    Fig 2C-E; Fig 3A: All figures are colored white to red. With this color scale, it's hard to appreciate which variants are neutral versus those that are enriched. A scale with two or more colors would be more appropriate. Log-scaling might be wise to get a better sense of the dynamic range that is clearly present in Fig 2F.

    Fig 4F: Needs a statistical test between bar plots.

    Fig 6A-C: These figures have tiny symbols that represent the architecture at an insertion position. It would probably be easier to read if the same architecture annotations from Fig 4B or C were used.

    Fig 6D: Needs tests for significance.

    Supp fig 4E: This figure is the first evidence that the physicochemical properties of amino acids within surface-exposed loops determine host specificity. This is followed up by Figure 4D and E. I would consider moving this to one of the main figures.

    Supp fig 5: A truth table could be useful here to test for the ability to classify based on Rosetta compared to FD. It looks like the tolerant residues have a distinct pattern here.
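
    A minimal sketch of what such a scale could look like: scores are log2-transformed and mapped onto a blue-white-red gradient centered on a neutral score of 1. The function name, the choice of colors, and the clipping range are assumptions for illustration, not a prescription for the figures.

```python
import math


def diverging_color(score, max_abs_log2=3.0):
    """Map a positive score to an (r, g, b) tuple with channels in [0, 1].

    score == 1 (assumed neutral) maps to white, scores above 1 shade toward
    red, and scores below 1 shade toward blue. Values beyond
    2 ** max_abs_log2 (or below its reciprocal) are clipped.
    """
    if score <= 0:
        raise ValueError("score must be positive for log scaling")
    position = math.log2(score) / max_abs_log2
    position = max(-1.0, min(1.0, position))
    if position >= 0:
        # Fade green and blue channels to move from white toward red.
        return (1.0, 1.0 - position, 1.0 - position)
    # Fade red and green channels to move from white toward blue.
    return (1.0 + position, 1.0 + position, 1.0)


# Made-up scores, for illustration only.
for score in (0.125, 0.5, 1.0, 2.0, 8.0):
    r, g, b = diverging_color(score)
    print(f"score {score}: rgb = ({r:.2f}, {g:.2f}, {b:.2f})")
```

    Centering the scale on the neutral score makes a two-fold enrichment and a two-fold depletion equally far from white, which a linear white-to-red scale cannot show.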

    Why are these colored white to red?

  2. Reviewer #3:

    Huss et al. describe a phage genome engineering technology that they call ORACLE. This technique uses recombineering of a phage target gene with a variant library to identify both gain- and loss-of-function mutations. The beauty of this method, and what makes it superior to other techniques, is that it dramatically limits the loss of less fit mutants during the initial round of library generation. Thus, the pool of variants is vast and is less biased toward the species that are more fit on the host used for initial library amplification. They use the model coliphage T7 as a proof of principle and show that several previously unidentified residues in the T7 tail fiber play critical roles in both loss and gain of function for phage infectivity; they also identify residues that are major drivers of altered host tropism. Lastly, they apply this library to a pathogenic UTI-associated strain of E. coli, which is normally resistant to wild-type T7 infection, and identify tail variants of T7 that can now infect this strain, highlighting the applicability of this method toward the discovery of engineered phages that could be used therapeutically. Altogether, this is an important advancement in phage engineering that shows promise for future phage therapies.

  3. Reviewer #2:

    The authors report a new approach, termed ORACLE, to develop locus-specific phage variants. It includes a recombination step, whose efficacy is improved by the overexpression of a dedicated recombinase, followed by an enrichment step performed using CRISPR/Cas9. They applied this method to create a mutant library containing 1660 variants of the tip domain of the T7 tail fiber. The performance of each variant was determined by quantifying its abundance before and after selection on three E. coli strains, compared to the WT phage. Their findings show that single amino acid changes in the tip of gp17 can have major consequences for phage performance on different hosts. They then tested whether these variants would be less prone to select for phage resistance, using a UTI strain. Finally, they searched for variants that would be more prone to infect one host than another and successfully tested their predictions.

    The ORACLE approach is overall novel and has some advantages over existing methods, mainly for the generation of mutation libraries of genes. The authors did a nice (even if very lengthy) job of showing how mutations have consequences for the structure and function of the tail fiber gene and how that influences performance on different hosts, including combating host resistance.

    The authors state that ORACLE overcomes three major hurdles that make it better than existing methods, one of which is "generalizability for virtually any phage", while denouncing other systems for being applicable for highly transformable hosts only. This is highly exaggerated since ORACLE requires transformation of two plasmids (helper and donor) including one with tunable gene expression, which is clearly not possible in many bacteria. Furthermore, the enrichment step requires a strain with a functional CRISPR/Cas9 system, which again is not so obvious in the bacterial world.

    The authors disregard the bias that can be generated at the "O" step if a variant reproduces better than the WT. They should also mention the bias arising from non-viable or severely infection-hampered variants, which would not pass the accumulation step. This is briefly mentioned later in the manuscript but should be raised earlier.

    The weakest paragraph is the one dealing with the UTI strain. I have the feeling that this paragraph could simply be deleted without changing the overall story. Approaching resistance, selection, and evolution would require more experiments than the very simplistic lysis curves. The authors did not even show adequately that cells growing after 5-10 hours are either genotypically or phenotypically resistant cells. A more appropriate qualification would be "insensitive" instead of resistant.

  4. Reviewer #1:

    Huss et al. have developed a novel tool (ORACLE) for generating libraries of phage variants. They go on to apply this tool to study the residues important for T7 host specificity, providing a rich dataset for in-depth functional studies. They validate a subset of hits and use this information to engineer T7 variants that may be able to overcome bacterial resistance against a urinary tract infection associated strain, consistent with their in vitro results. Their approach provides both a valuable new tool and intriguing biological insights prompting future studies.

    Major suggestions for improvement:

    1. The writing could be much more concise.

    2. Claims about generalizability should either be removed or supported by additional data. This study focused on a single phage gene and a single host bacterial species. As such, it is not clear if ORACLE will work well in other contexts.