Tuning aromatic contributions by site-specific encoding of fluorinated phenylalanine residues in bacterial and mammalian cells

Curation statements for this article:
  • Curated by Biophysics Colab

    Endorsement statement (3 October 2022)

    The preprint by Galles et al. reports the generation of pyrrolysine-based aminoacyl-tRNA synthetases capable of incorporating fluorinated phenylalanine non-canonical amino acids into proteins expressed in either bacteria or mammalian cells. For the most extensively characterized synthetases, fluorinated phenylalanine derivatives were successfully incorporated into GFP and two membrane proteins (CFTR and Nav1.5) at expression levels adequate for biochemical studies, suggesting that the approach could be combined with multiple different structural and biophysical techniques. The work provides a valuable tool that will enable the functional role of cation-pi interactions to be interrogated in both soluble and integral membrane proteins.

    (This endorsement by Biophysics Colab refers to version 2 of this preprint, which has been revised in response to peer review of version 1.)

Abstract

The aromatic side-chains of phenylalanine, tyrosine, and tryptophan interact with their environments via both hydrophobic and electrostatic interactions. Determining the extent to which these contribute to protein function and stability is not possible with conventional mutagenesis. Serial fluorination of a given aromatic is a validated method in vitro and in silico to specifically alter electrostatic characteristics, but this approach is restricted to a select few experimental systems. Here, we report a new group of pyrrolysine-based aminoacyl-tRNA synthetase/tRNA pairs that enable the site-specific encoding of a varied spectrum of fluorinated phenylalanine amino acids in E. coli and mammalian (HEK 293T) cells. By allowing the cross-kingdom expression of proteins bearing these unnatural amino acids at biochemical scale, these tools will enable deconstruction of biological mechanisms which utilize aromatic-pi interactions in structural and cellular contexts.

Statement of Significance

The aromatic side-chains of phenylalanine, tyrosine, and tryptophan are crucial for protein function and pharmacology due to their hydrophobic and electrostatic contributions to catalytic centers and ligand-binding pockets. However, few experimental approaches can chemically assess the functional roles of aromatics in cellular environments. The accepted computational method for aromatic interrogation is via serial fluorination, which lacks an experimental correlate in bacterial or mammalian cell systems. We have identified a family of synthetases to encode multiple different types of fluorinated phenylalanine residues in E. coli and HEK cells via nonsense suppression. The efficiency of these synthetases is sufficient to support biochemical characterization and structural determination of proteins with site-specific incorporation of unnatural phenylalanine analogs.

Article activity feed

  2. Authors' response (20 July 2022)

    GENERAL ASSESSMENT

    This is an interesting preprint wherein the authors report the generation of pyrrolysine-based aminoacyl-tRNA synthetases capable of incorporating fluorinated phenylalanine non-canonical amino acids (ncAA) into proteins expressed in either bacteria or mammalian cells. Synthetase evolution was directed using para-methyl tetrafluorophenylalanine as an ncAA. The authors use several screens and assays to characterize individual synthetases using superfolder GFP to measure protein expression using fluorescence and then identify the incorporated ncAA using mass spectrometry. For the two most extensively characterized synthetases, a wide array of fluorinated phenylalanine derivatives can be successfully incorporated. Contaminating incorporation of phenylalanine can be detected when attempting to incorporate several ncAAs in E. coli, but this bleed-through incorporation of phenylalanine is not observed for expression in mammalian cells. In the case of monofluorinated phenylalanine for expression in mammalian cells, incorporation in place of phenylalanine is observed in other regions of GFP outside the site directed using the amber codon, making it challenging to incorporate the less heavily fluorinated ncAAs. Finally, the authors demonstrate that GFP and two membrane proteins (CFTR and Nav1.5) can be expressed at levels adequate for biochemical studies with one of the synthetases in the presence of a trifluoro phenylalanine ncAA, suggesting that the approach should be feasible for combining with many structural and biophysical approaches. Overall, this is an interesting study that generates several new synthetases that have utility for incorporation of fluorinated phenylalanine derivatives that can be used for expression in both prokaryotic and eukaryotic expression systems and that would likely be feasible for use for structural and other mechanistic biophysical studies.

    RECOMMENDATIONS

    Revisions essential for endorsement:

    1. The results and methods sections could be improved throughout by taking the space to provide the reader with a clearer conceptualization of each step in the process implemented by the authors to evolve the synthetase to incorporate fluorinated phenylalanine derivatives. What was the starting synthetase (presumably from ref 39)? Which positions were subject to random mutagenesis? Is the library a new one or one used previously? Can the authors provide the sequences for the different evolved synthetases characterized here? In the methods it is stated that 17 unique sequences were identified, but why not report what they are? Might it be worthwhile discussing how the present results compare with earlier attempts to evolve the synthetase for other ncAAs? It was also not entirely clear to us how the positive and negative selection screens work. How many rounds of positive-negative selection have been made? Which mutations have been identified in the variants? Why do the results shown in Fig. 2B seem to disagree with some of the results shown in Fig. 2C? C10 is mentioned in the methods section but not shown in Fig. 2C. More extensive citation of prior work would help, but the work will be much more accessible to the general audience if the authors explain everything conceptually in the results section and add more details to the methods section. The screening protocol used to identify synthetase variants is explained in greater detail in the methods; however, the corresponding figure (Fig. 2) is not very well explained. Please provide more explanation in the legend or the text about the results presented. Do Round 1 and Round 2 correspond to two different rounds of positive-negative screening? Panel C: it's not clear what round 1 and round 2 stand for, and why some of the mutants are in round 1 and others in round 2. Were the UP50 plots done for the "top performing synthetases" in both rounds? Please specify. Please state what the crosses mean in Fig. 2B.

    We thank you for these comments.

    -The specifics of the library and the active site sequences will be included in the final published version and were only omitted due to the nature of preprint publishing (to which we are still acclimating).

    -In the results section we clarified what is meant by "rounds" of selection as simply two independent attempts at screening the same library with the same starting amino acid.

    -Characterization of C10 was not included because although it came out of the screen as its own "hit," we realized its sequence was the same as one of the other enzymes identified.

    -Our recent methods paper, Galles et al. (MIE, 2021), as cited, is a very extensive step-by-step guide from our group describing the screening method. This paper goes into detail on the positive and negative screening, and includes an informative cartoon figure.

    -The positions with X's produced unreliable fluorescence data.

    2. The authors demonstrate that they can incorporate different fluorinated phenylalanine residues into GFP and that under similar conditions two membrane proteins can be expressed at reasonable levels. What is less clear is how well the ncAAs will be incorporated into different membrane proteins. Might the procedures employed for GFP work less effectively for other proteins? The claim that the technique is widely applicable to membrane proteins would be strengthened if the authors could provide evidence for robust incorporation of ncAAs into membrane proteins, but even if this is too challenging for the time being, the authors should openly discuss problems that might be encountered or what makes them optimistic that the synthetases developed here will be effective at incorporating ncAAs into proteins beyond what they have shown for GFP.

    We appreciate this point. In this respect, the western blots of CFTR and Nav 1.5 are intended to indicate relative expression compared to wild type; that being said, we and others have found that relative expression with a given synthetase is target- and position-dependent. In our opinion, the targets shown, two completely unrelated membrane proteins (CFTR and Nav 1.5) and GFP, cover a broad spectrum of potential uses. In the lab we have shown rescue in additional soluble and membrane proteins, but those are related to specific future projects and beyond the scope of this study.

    Note that the reviewer suggestion below (#3, functional characterization of the rescued Nav 1.5 channel) offers a route to further evidence on the utility of the system and the robustness of expression. We accomplished this and included it in the revised preprint (see below).

    3. One opportunity for demonstrating robust trifluorophenylalanine incorporation into Nav1.5 might be to include functional data demonstrating that the gating properties of the channel are altered compared to control. Is the F1486 position sufficiently sensitive for a functional readout to provide at least qualitative information about the extent of ncAA incorporation? This would also demonstrate trafficking of the protein to the membrane as a functional channel. Although it is difficult to measure the intact molecular weight of hCFTR and hNaV 1.5 proteins due to their size by MS, have the authors tried LC-ESI-MS/MS analysis following enzymatic digestion? This could conceivably help to validate not only the incorporation of fluoro-Phe ncAAs, but also the site specificity of incorporation.

    We appreciate and have now addressed these points. See new Figure 7 with patch clamp evidence of macroscopic expression (7G through 7J) and MS/MS spectra for F1486(2,3,6F Phe) Nav 1.5 (7F). Note that we did not expect a large functional effect of incorporating tri-fluoro Phe into this position, so showing rescue via unbiased biochemical (western blot) indication of full-length expression is preferred (Figure 7C-D). Unsurprisingly, encoding of 2,3,6 trifluoro Phe at F1486 was functionally tolerated; large (multi-nanoampere), normally activating and inactivating currents were observed (Fig 7G). However, as shown in the revised paper, we did discover that incorporation subtly enhanced inactivation (left shift in steady-state inactivation and impaired recovery from inactivation; Fig 7I-J). As the binding of the IFM inactivation motif to its inactivated-state receptor in Nav is believed to be driven by hydrophobic interactions, subtle enhancement of inactivation via fluorination is consistent with the idea that fluorination can, in some cases, increase stability of hydrophobic cores. We expound upon this in the relevant section of the results.

    Additional suggestions for the authors to consider:

    1. The sfGFP protein samples purified for intact LC-ESI-MS analysis can be used for MS/MS analysis. Most mass spectrometers have MS/MS capability. The protein sequences and the structures of F-Phe ncAAs are known. All these make the MS/MS validation applicable. Most importantly, the results would provide strong evidence of site-specific encoding of F-Phe in proteins.

    Thank you for this suggestion. To most efficiently address this point and #3 above, we expressed and purified F1486(2,3,6F Phe) Nav 1.5 and subjected it to tryptic digestion and MS/MS. These data now comprise Figure 7F. Indeed, these data confirmed incorporation at position F1486.

    2. Although a soluble protein was used to test the synthetases, the presentation gives the impression that the ultimate goal is for use on membrane proteins. Although membrane proteins are of interest to many and to the authors, why not present it as useful for both soluble and membrane proteins? Are there any known examples of cation-pi interactions mediated by Phe in soluble proteins that would be worth investigating? A more general point is that the authors could provide better framing or context by discussing how important cation-pi interactions are in proteins and what we know about them. In that regard, the intro would benefit from a few more sentences giving examples of important cation-pi interactions, and/or summarizing briefly findings of the in silico studies that are mentioned.

    This suggestion is reasonable; as noted, our group's focus on membrane proteins affects the framing and discussion. That said, we do believe that the system will have broad utility. We edited the introduction and discussion to better reflect this. We also recently published a review article that details a wide range of cation-pi examples in membrane proteins (cited in this paper- Infield et al. JMB, 2021). This paper also discusses soluble proteins and examples of soluble domains of membrane proteins that have cation-pi interactions.

    3. Could the introduction of the ncAA affect GFP fluorescence? Along these same lines, could the authors explain why they selected residue N150 for the introduction of the ncAA?

    Thank you for this comment. Previous work has shown that fluorinated aromatic analogs do not appreciably affect fluorescence of this GFP design. We have added the relevant citation (Miyake-Stoner et al, Biochemistry, 2010) to the paper. Position N150 was chosen because it is extremely popular in the field; it has been used for dozens of studies reporting new synthetases. This enables important context when evaluating new synthetases that have been discovered.

    4. Providing the specific sequences of sfGFP-His expressed in E. coli and HEK 293T cells, and adding the expected Δmass for N150F, would help readers to better understand the intact ESI-MS data presented in the paper. It is also hard to read the labels in Fig. 4.

    We added these protein sequences as a new Supplemental figure (5). We also added text on page 10 to clarify that the substitution of asparagine with phenylalanine yields a mass change of +33 Da.
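
    The +33 Da figure quoted above can be checked with simple residue-mass arithmetic. The following is a minimal sketch, not the authors' analysis: it assumes average atomic weights rounded to three decimals, treats each ring fluorination as an H-to-F substitution, and uses hypothetical function names chosen only for illustration.

```python
# Average atomic masses in Da (standard atomic weights, rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "F": 18.998}

# Residue compositions (free amino acid minus one water).
ASN = {"C": 4, "H": 6, "N": 2, "O": 2}
PHE = {"C": 9, "H": 9, "N": 1, "O": 1}


def residue_mass(composition):
    """Sum the average mass of a residue from its elemental composition."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def fluoro_phe(n_fluorines):
    """Composition of a Phe residue with n ring hydrogens replaced by fluorine."""
    if not 0 <= n_fluorines <= 5:
        raise ValueError("the phenyl ring has only five hydrogens to replace")
    composition = dict(PHE)
    composition["H"] -= n_fluorines
    composition["F"] = n_fluorines
    return composition


def delta_mass_vs_asn(n_fluorines):
    """Expected mass shift when Asn is replaced by an n-fold fluorinated Phe."""
    return residue_mass(fluoro_phe(n_fluorines)) - residue_mass(ASN)


if __name__ == "__main__":
    for n in range(6):
        print(f"{n} F: {delta_mass_vs_asn(n):+.2f} Da relative to Asn")
```

    Under these assumptions, each added fluorine shifts the expected mass by roughly 18 Da on top of the Asn-to-Phe difference, which is the spacing one would look for between species in intact-mass spectra.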

    5. The authors might consider citing Last et al., as it features an interesting role of Phe residues in anion selectivity in the Fluc channel (Last et al. (2017) eLife 6:e31259).

    Thank you for pointing this out; we have now mentioned this interesting study and mechanism in the introduction of the paper.

    6. Table 1 shows ΔG in the binding energy measurements, but we don't recall seeing in the manuscript how ΔG was calculated. Also, we may be missing something, but the theoretical quantum calculations referenced in the text (ref 24) will give a result in ΔE as energy. We are also curious about the meaning of the PHE% (Table 1 as well). How was it calculated? What kind of information is it providing?

    The calculations were described under the methods section "Quantum calculations of cation pi binding potential" (at the very end).

    Phe% is a simple transformation of the data: the percentage of cation-pi binding ability for a given species as compared to the native Phe, which is the strongest interactor. We have added a sentence better explaining this in the results section.
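
    Read literally, the transformation described above is a ratio to the native Phe value. A minimal sketch follows, assuming Phe% is the analog's cation-pi binding energy divided by that of native Phe, times 100; the function name and the numbers in the usage example are hypothetical placeholders, not values from Table 1.

```python
def phe_percent(analog_binding_energy, phe_binding_energy):
    """Cation-pi binding ability of an analog as a percentage of native Phe.

    Both energies must use the same units and sign convention.
    """
    if phe_binding_energy == 0:
        raise ValueError("the reference Phe binding energy must be non-zero")
    return 100.0 * analog_binding_energy / phe_binding_energy


if __name__ == "__main__":
    # Placeholder energies for illustration only (not data from the preprint).
    reference = -10.0
    analog = -5.0
    print(f"Phe% = {phe_percent(analog, reference):.0f}")
```

    With this definition, native Phe scores 100 by construction, and any analog with weaker cation-pi binding falls below it.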

    (This is a response to peer review conducted by Biophysics Colab on version 1 of this preprint.)

  3. Consolidated peer review report (12 May 2022)

    GENERAL ASSESSMENT

    This is an interesting preprint wherein the authors report the generation of pyrrolysine-based aminoacyl-tRNA synthetases capable of incorporating fluorinated phenylalanine non-canonical amino acids (ncAA) into proteins expressed in either bacteria or mammalian cells. Synthetase evolution was directed using para-methyl tetrafluorophenylalanine as an ncAA. The authors use several screens and assays to characterize individual synthetases using superfolder GFP to measure protein expression using fluorescence and then identify the incorporated ncAA using mass spectrometry. For the two most extensively characterized synthetases, a wide array of fluorinated phenylalanine derivatives can be successfully incorporated. Contaminating incorporation of phenylalanine can be detected when attempting to incorporate several ncAAs in E. coli, but this bleed-through incorporation of phenylalanine is not observed for expression in mammalian cells. In the case of monofluorinated phenylalanine for expression in mammalian cells, incorporation in place of phenylalanine is observed in other regions of GFP outside the site directed using the amber codon, making it challenging to incorporate the less heavily fluorinated ncAAs. Finally, the authors demonstrate that GFP and two membrane proteins (CFTR and Nav1.5) can be expressed at levels adequate for biochemical studies with one of the synthetases in the presence of a trifluoro phenylalanine ncAA, suggesting that the approach should be feasible for combining with many structural and biophysical approaches. Overall, this is an interesting study that generates several new synthetases that have utility for incorporation of fluorinated phenylalanine derivatives that can be used for expression in both prokaryotic and eukaryotic expression systems and that would likely be feasible for use for structural and other mechanistic biophysical studies.

    RECOMMENDATIONS

    Revisions essential for endorsement:

    1. The results and methods sections could be improved throughout by taking the space to provide the reader with a clearer conceptualization of each step in the process implemented by the authors to evolve the synthetase to incorporate fluorinated phenylalanine derivatives. What was the starting synthetase (presumably from ref 39)? Which positions were subject to random mutagenesis? Is the library a new one or one used previously? Can the authors provide the sequences for the different evolved synthetases characterized here? In the methods it is stated that 17 unique sequences were identified, but why not report what they are? Might it be worthwhile discussing how the present results compare with earlier attempts to evolve the synthetase for other ncAAs? It was also not entirely clear to us how the positive and negative selection screens work. How many rounds of positive-negative selection have been made? Which mutations have been identified in the variants? Why do the results shown in Fig. 2B seem to disagree with some of the results shown in Fig. 2C? C10 is mentioned in the methods section but not shown in Fig. 2C. More extensive citation of prior work would help, but the work will be much more accessible to the general audience if the authors explain everything conceptually in the results section and add more details to the methods section. The screening protocol used to identify synthetase variants is explained in greater detail in the methods; however, the corresponding figure (Fig. 2) is not very well explained. Please provide more explanation in the legend or the text about the results presented. Do Round 1 and Round 2 correspond to two different rounds of positive-negative screening? Panel C: it’s not clear what round 1 and round 2 stand for, and why some of the mutants are in round 1 and others in round 2. Were the UP50 plots done for the “top performing synthetases” in both rounds? Please specify. Please state what the crosses mean in Fig. 2B.

    2. The authors demonstrate that they can incorporate different fluorinated phenylalanine residues into GFP and that under similar conditions two membrane proteins can be expressed at reasonable levels. What is less clear is how well the ncAAs will be incorporated into different membrane proteins. Might the procedures employed for GFP work less effectively for other proteins? The claim that the technique is widely applicable to membrane proteins would be strengthened if the authors could provide evidence for robust incorporation of ncAAs into membrane proteins, but even if this is too challenging for the time being, the authors should openly discuss problems that might be encountered or what makes them optimistic that the synthetases developed here will be effective at incorporating ncAAs into proteins beyond what they have shown for GFP.

    3. One opportunity for demonstrating robust trifluorophenylalanine incorporation into Nav1.5 might be to include functional data demonstrating that the gating properties of the channel are altered compared to control. Is the F1486 position sufficiently sensitive for a functional readout to provide at least qualitative information about the extent of ncAA incorporation? This would also demonstrate trafficking of the protein to the membrane as a functional channel. Although it is difficult to measure the intact molecular weight of hCFTR and hNaV 1.5 proteins due to their size by MS, have the authors tried LC-ESI-MS/MS analysis following enzymatic digestion? This could conceivably help to validate not only the incorporation of fluoro-Phe ncAAs, but also the site specificity of incorporation.

    Additional suggestions for the authors to consider:

    1. The sfGFP protein samples purified for intact LC-ESI-MS analysis can be used for MS/MS analysis. Most mass spectrometers have MS/MS capability. The protein sequences and the structures of F-Phe ncAAs are known. All these make the MS/MS validation applicable. Most importantly, the results would provide strong evidence of site-specific encoding of F-Phe in proteins.

    2. Although a soluble protein was used to test the synthetases, the presentation gives the impression that the ultimate goal is for use on membrane proteins. Although membrane proteins are of interest to many and to the authors, why not present it as useful for both soluble and membrane proteins? Are there any known examples of cation-pi interactions mediated by Phe in soluble proteins that would be worth investigating? A more general point is that the authors could provide better framing or context by discussing how important cation-pi interactions are in proteins and what we know about them. In that regard, the intro would benefit from a few more sentences giving examples of important cation-pi interactions, and/or summarizing briefly findings of the in silico studies that are mentioned.

    3. Could the introduction of the ncAA affect GFP fluorescence? Along these same lines, could the authors explain why they selected residue N150 for the introduction of the ncAA?

    4. Providing the specific sequences of sfGFP-His expressed in E. coli and HEK 293T cells, and adding the expected Δmass for N150F, would help readers to better understand the intact ESI-MS data presented in the paper. It is also hard to read the labels in Fig. 4.

    5. The authors might consider citing Last et al., as it features an interesting role of Phe residues in anion selectivity in the Fluc channel (Last et al. (2017) eLife 6:e31259).

    6. Table 1 shows ΔG in the binding energy measurements, but we don’t recall seeing in the manuscript how ΔG was calculated. Also, we may be missing something, but the theoretical quantum calculations referenced in the text (ref 24) will give a result in ΔE as energy. We are also curious about the meaning of the PHE% (Table 1 as well). How was it calculated? What kind of information is it providing?

    REVIEWING TEAM

    Reviewed by:

    Ana I. Fernández-Mariño, Research Fellow (K.J. Swartz lab, NINDS, USA): ion channel structure and mechanism, electrophysiology and molecular biophysics

    Yan Li, Director Proteomics Core, NINDS, USA: protein mass spectrometry

    Chloé Martens, Assistant Professor, Université Libre de Bruxelles: membrane protein structural biology, membrane transport

    Kenton J. Swartz, Senior Investigator, NINDS, USA: ion channel structure and mechanisms, chemical biology and biophysics, electrophysiology and fluorescence spectroscopy

    Curated by:

    Kenton J. Swartz, Senior Investigator, NINDS, USA

    (This consolidated report is a result of peer review conducted by Biophysics Colab on version 1 of this preprint. Minor corrections and presentational issues have been omitted for brevity.)