Resource allocation accounts for the large variability of rate-yield phenotypes across bacterial strains
Curation statements for this article:
Curated by eLife
Evaluation Summary:
This study develops a resource allocation model for E. coli growing under steady-state conditions. The model describes both growth rate and yield and has been subjected to validation by comparison with a compiled data set. The manuscript addresses an important problem of interest to a wide range of investigators. At the same time, the authors would need to explore different assumptions for the housekeeping proteome fraction (phi_q) to ensure the model is robust.
(This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)
Abstract
Different strains of a microorganism growing in the same environment display a wide variety of growth rates and growth yields. We developed a coarse-grained model to test the hypothesis that different resource allocation strategies, corresponding to different compositions of the proteome, can account for the observed rate-yield variability. The model predictions were verified by means of a database of hundreds of published rate-yield and uptake-secretion phenotypes of Escherichia coli strains grown in standard laboratory conditions. We found a very good quantitative agreement between the range of predicted and observed growth rates, growth yields, and glucose uptake and acetate secretion rates. These results support the hypothesis that resource allocation is a major explanatory factor of the observed variability of growth rates and growth yields across different bacterial strains. An interesting prediction of our model, supported by the experimental data, is that high growth rates are not necessarily accompanied by low growth yields. The resource allocation strategies enabling high-rate, high-yield growth of E. coli lead to a higher saturation of enzymes and ribosomes, and thus to a more efficient utilization of proteomic resources. Our model thus contributes to a fundamental understanding of the quantitative relationship between rate and yield in E. coli and other microorganisms. It may also be useful for the rapid screening of strains in metabolic engineering and synthetic biology.
Author Response
Reviewer 2 (Public Review):
The authors’ coarse-grained mathematical model is based upon proteome partitioning constraints. Similar models have been developed in the past, although the authors do an excellent job distinguishing their work. The interdependence among growth rate, growth yield, and carbon transport (together with the comparatively few state variables) makes the proposed model an attractive general framework for predictive metabolic engineering and strain optimization in biomanufacturing.
Strengths:
 The recognition that the constant biomass concentration (1/beta) can be used to recast the growth rate versus growth yield tradeoff in terms of a growth rate versus carbon uptake tradeoff (lines 147-155, Eq. 2), and coupling of the growth and carbon uptake rates through proteome partitioning, are powerful ideas. They transform the traditional (false) dichotomy of a negative correlation between growth and yield into a feasible space of growth-yield combinations (e.g. Figs 2B-C).
 The authors calibrate the model for E. coli (BW25113) grown in glycerol/glucose, batch/continuous culture (lines 157-164), then apply the model to an impressive variety of E. coli strains. This is not typically done with semi-mechanistic models and elevates the authors’ approach by implying that their model is sufficiently general so as to apply across strains, yet sufficiently constrained so as to provide quantitative predictions.
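The link between growth rate, growth yield, and carbon uptake noted in the first strength can be illustrated with a minimal sketch. It assumes the common definition of growth yield as growth rate divided by specific substrate uptake rate. The function name, the units, and the strain values are illustrative assumptions and are not taken from the manuscript.

```python
# Minimal sketch (assumption): yield = growth rate / specific substrate uptake rate.
GLUCOSE_MOLAR_MASS = 180.16  # g/mol

def growth_yield(mu_per_h, uptake_mmol_per_gdw_h, molar_mass=GLUCOSE_MOLAR_MASS):
    """Return the growth yield in g biomass per g substrate.

    mu_per_h: growth rate in 1/h
    uptake_mmol_per_gdw_h: substrate uptake rate in mmol per g dry weight per h
    """
    # Convert the uptake rate from mmol to g of substrate per g dry weight per h
    uptake_g_per_gdw_h = uptake_mmol_per_gdw_h * molar_mass / 1000.0
    return mu_per_h / uptake_g_per_gdw_h

# Hypothetical strains: same growth rate, different glucose uptake rates
for name, mu, uptake in [("strain A", 0.6, 8.0), ("strain B", 0.6, 10.0)]:
    print(name, round(growth_yield(mu, uptake), 3))
```

Under this definition, a strain that reaches the same growth rate with a lower uptake rate has a higher yield, so rate and yield need not be negatively correlated across strains.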
Weaknesses:
 The tension between generality and constraint leads to some category errors where strain-specific empirical invariants are taken as general strain-independent operating conditions. This happens at least twice: a minor case involving the growth-rate threshold for acetate overflow, and a serious case where the magnitude of the ‘housekeeping’ proteome fraction φ_q is taken to be strain- and condition-independent.
a) (lines 82-86) The growth-rate threshold for the acetate overflow switch in E. coli was observed in ‘studies with a single strain in different conditions’ [i.e. different carbon sources in batch]. The interpretation provided in the references cited (lines 83-84) is that the threshold is a manifestation of a tipping point between carbon uptake rate and the costs of energy generation. The carbon uptake rate is implicitly strain-dependent; there is no reasonable expectation that all strains growing in glucose will be fermenting (or all respiring). The conclusion (line 84) that ‘the model predicted no correlation between growth rate and acetate secretion rate in the case of different strains growing in the same environment’ is tautological when the carbon uptake rate (v_mc) is used by the authors to distinguish among strains. This error is easily fixed by simply changing the wording, but it serves to illustrate how constraints operating at the strain level can be tacitly (and erroneously) applied at the genus level.
The emphasis we put on the comparison between batch growth on glucose of different strains vs batch growth in different environments of a single strain may have been misleading. The point we wanted to make was that the occurrence of fermentation (acetate overflow) during fast growth on glucose is not a necessary consequence of intrinsic physical constraints on metabolism, but the consequence of strain-specific regulatory mechanisms. This is demonstrated by the existence of E. coli strains that do not ferment while growing on glucose, but that have essentially the same metabolic capacities as strains that do. When we started this study, we did expect (perhaps naively) that growth on glucose at a high rate necessarily comes with low yield due to the higher relative acetate overflow, that is, the ratio of the acetate secretion and glucose uptake rates (Supplementary Figure 4 in the revised manuscript).
In the new version of the manuscript, we have modified the analysis of the glucose uptake and acetate secretion data, by plotting them against growth rate and growth yield in separate 2D plots, as suggested by Reviewer 1. This has led to a perspective that is more in line with the comment of this reviewer that the model explores different ways in which a carbon uptake rate can be converted into a growth rate, depending on the selected resource allocation strategy, and that this gives rise to tradeoffs between growth rate and growth yield. In the context of this analysis, we do come back to the original point we wanted to make, but phrased differently (and hopefully more clearly this time).
Changes in manuscript: The comparison between batch growth on glucose of different strains and batch growth on different carbon sources of a single strain is less emphasized. We have rewritten the section and rephrased our claims accordingly throughout the paper (notably in the Abstract, Introduction, and Discussion).
b) The second example of this strain-genus confusion is more serious, and perhaps is enough to unravel the model. One of the strengths of the current framework is that although there are four degrees of freedom via the proteome allocation parameters, the model is sufficiently constrained that the behavior can be meaningfully projected onto lower-dimensional observables like growth rate and yield (e.g. Figs 2B-C).
One of the main constraints in the model that allows this meaningful projection is the assumption that the fraction of ‘housekeeping’ proteins φ_q is constant irrespective of strain and growth conditions (line 172) and that these proteins carry flux synthesizing non-protein macromolecules (lines 141-142). Neither of these claims is supported by the references provided.
The ‘housekeeping’ fraction φ_q was inferred in Scott et al. 2010 (line 172) from a nearly growth-medium-independent maximum in the RNA/protein ratio under translation limitation of strain MG1655. The magnitude of that intercept is highly strain-dependent and can vary nearly 2-fold, especially in ALE strains. Furthermore, subsequent proteomic data (e.g. Hui et al. 2015 cited by the authors) has clarified that this ‘housekeeping’ fraction is, for the most part, composed of growth-rate-independent offsets in the metabolic proteins.
The origin of these offsets is thought to be related to substrate saturation (Eqs. 1 and 2 of Dourado et al. 2021 cited by the authors) and consequently, these offsets (and by extension most of φ_q) carry no flux. Substrate saturation is perhaps at the root of the discrepancy in the Fig. 4 fits that necessitates adjustment of the catalytic constants (line 338). It is not correct to say that ‘external substrate concentration S is assumed constant’ (bottom p. 25) therefore the catabolic rate v_mc is an environment-dependent [i.e. substrate-concentration-independent] parameter. The ‘mc’ proteins include carbon uptake and metabolism (e.g. Fig 1, or Table 2) so that intracellular changes in S could arise from strain differences thereby affecting v_mc and the magnitude of the ‘housekeeping’ fraction.
It is not clear to me how the predictive power of the model will be affected by relaxing the constant φ_q assumption and replacing it with the more justifiable assumption that all metabolic proteins contribute some small fraction to φ_q based upon substrate saturation.
The reviewer criticizes two assumptions made in the construction and analysis of the model: (i) the fraction of housekeeping proteins is constant irrespective of strain and growth conditions, and (ii) the housekeeping proteins carry flux because they synthesize macromolecules other than proteins. Below, we summarize how we have tried to clarify these assumptions and which additional work we have performed to build model variants relaxing the assumptions.
We identified the housekeeping protein category with the Q-sector in the original paper of Scott et al. [13], which was misleading. The Hwa group indeed defines the Q-sector as not carrying flux [7], whereas we do allow this for the housekeeping protein category. Our housekeeping protein category, which we refer to as “other proteins” or “residual proteins” (M_u) in the new version of the manuscript, consists of all proteins not assigned to the categories of ribosomes and translation-affiliated proteins (R), enzymes in central carbon metabolism (M_c), or enzymes in energy metabolism (M_er + M_ef). M_u carries flux because it includes (among other things) the machinery for DNA and RNA synthesis (DNA polymerase, RNA polymerase, ...). When plotting the proteome fraction of this category determined from the data of Schmidt et al. [12], we found that the fraction remains approximately constant over a large range of growth conditions. This motivated the simplifying assumption to keep the proteome fraction for M_u constant in the simulations.
The reviewer is right, however, that this may not be the case when considering a variety of E. coli strains growing on glucose, especially the strains resulting from laboratory evolution experiments. We have therefore redone the simulations while allowing the M_u category to vary, by a percentage corresponding to experimentally observed variations of this category over the range of growth conditions considered by Schmidt et al. [12] (Supplementary Figure 1). In comparison with the original results, the relaxation of this condition enlarges the attainable range of growth rates by about 10%, but the overall shape of the cloud of rate-yield phenotypes remains the same. These new simulation results are shown in the main figures of the revised manuscript.
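One way to picture a residual protein fraction that varies within bounds is the following minimal sketch. It is not the sampling procedure used in the manuscript: the function name, the bounds, and the uniform sampling scheme are all hypothetical and serve only to show how allocation fractions can be drawn so that they sum to one while the residual fraction stays inside a prescribed interval.

```python
import random

def sample_allocation(residual_bounds, n_other=4, rng=random):
    """Draw one resource allocation strategy (hypothetical scheme).

    residual_bounds: (low, high) bounds for the residual protein fraction
    n_other: number of remaining protein categories (here four, standing
             for R, M_c, M_er, and M_ef)
    Returns (residual_fraction, list_of_other_fractions); all sum to 1.
    """
    low, high = residual_bounds
    residual = rng.uniform(low, high)
    # Normalized exponential draws are uniformly distributed on the simplex
    weights = [rng.expovariate(1.0) for _ in range(n_other)]
    total = sum(weights)
    others = [(1.0 - residual) * w / total for w in weights]
    return residual, others

random.seed(0)
# Hypothetical bounds, not the values derived from Schmidt et al.
residual, others = sample_allocation((0.45, 0.55))
print(round(residual + sum(others), 6))
```

Each sampled strategy would then be passed to the model to compute the corresponding steady-state growth rate and yield, which is how a cloud of rate-yield phenotypes could be generated.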
In parallel, we have developed a model variant that includes a Q category in the sense of Scott et al., defined by the (growth-rate-independent) offsets of the linear relations between growth rate and protein fractions [7]. We have retained an M_u category of other proteins in the model, interpreted as consisting of the growth-rate-dependent fraction of other proteins, including the molecular machinery responsible for the synthesis of other macromolecules. Whereas the M_u category carries a flux, this is not the case for the Q category. We have calibrated the model variant from the same data as the original model, and predicted the admissible rate-yield phenotypes. While the cloud of predicted rate-yield phenotypes is slightly displaced in comparison with the reference model, the overall qualitative shape is the same. We explain this robustness by the fact that, despite the different interpretation of the protein categories, the models are structurally very similar and calibrated from data for the same reference strain. This gives rise to different values of the catalytic constants, which compensate for the differences in protein concentrations. Note that more data are needed for the calibration of the model with the Q category, because it requires estimation of the growth-rate-independent proteome fraction for all individual protein categories. In particular, in addition to carbon limitation, conditions of nitrogen and sulfur limitation are necessary [7]. In the absence of such data, additional assumptions need to be made, as we have explained in the new version of the manuscript.
We could not find a discussion of the relation between substrate saturation and growth-rate-independent offsets in proteomics data in the paper by Dourado et al. [2]. In the revised version of the manuscript, however, we have exploited their idea to compare substrate saturation for different predicted and observed rate-yield phenotypes. As a prerequisite, this has required a refinement of the estimation of the half-saturation constants during model calibration, for which we have used the dataset of Km values collected by Dourado et al. [2]. The finding that high-rate, high-yield growth comes with high substrate saturation, indicating an efficient utilization of proteomic resources, has been given more emphasis in the revised manuscript. Note that each resource allocation strategy will give rise to a different concentration of metabolites, and therefore to a different level of substrate saturation of the enzymes.
The reviewer is right that the phrase “the external substrate concentration S is assumed constant” is not correct for batch growth, although it approximately holds for continuous growth in a chemostat. In the case of balanced growth in batch, the external substrate concentration S is much higher than the half-saturation constant K_mc (S >> K_mc), so that the kinetic equation for the macroreaction can be approximated by v_mc = m_c e_s, where e_s = k_mc. In the revised manuscript, we have explicitly distinguished between these two situations (batch and continuous growth). Note that S is not the intracellular, but the extracellular concentration of substrate.
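The batch approximation described above can be checked numerically with a minimal sketch. It assumes that the macroreaction follows Michaelis-Menten kinetics in the external substrate concentration, with a half-saturation constant called K here. The function names and all parameter values are hypothetical and chosen only to contrast a substrate-saturated regime with a substrate-limited one.

```python
def uptake_rate(k_mc, m_c, S, K):
    """Assumed Michaelis-Menten form: v_mc = k_mc * m_c * S / (S + K)."""
    return k_mc * m_c * S / (S + K)

def saturated_uptake_rate(k_mc, m_c):
    """Approximation for S much larger than K: v_mc = m_c * e_s with e_s = k_mc."""
    return k_mc * m_c

# Hypothetical values in arbitrary but consistent units
k_mc, m_c, K = 10.0, 0.2, 0.01

batch = uptake_rate(k_mc, m_c, S=20.0, K=K)      # S >> K, as in batch growth
chemostat = uptake_rate(k_mc, m_c, S=0.02, K=K)  # S comparable to K
approx = saturated_uptake_rate(k_mc, m_c)

print("batch / approximation:", batch / approx)
print("chemostat / approximation:", chemostat / approx)
```

In the saturated regime the ratio is close to one, so the approximation holds; when S is comparable to K the full expression is needed and the rate depends on the substrate concentration.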
Changes in manuscript: We have better explained the meaning of the residual protein category M_u and corrected the misleading identification of this category with the Q-sector of Scott et al. [13] in the section Coarse-grained model with coupled carbon and energy fluxes and in Appendix 1. In new subsections of Appendix 1 and Appendix 2, we discuss the construction and calibration of a model variant with an additional growth-rate-independent protein category corresponding to the Q-sector of Scott et al. In the Discussion, we explain that the rate-yield predictions obtained from this model and the reference model are essentially the same, indicating the robustness of the model predictions.
We have redone all simulations using a resource allocation parameter for the housekeeping protein fraction M_u that is allowed to vary within experimentally determined bounds (Coarse-grained model with coupled carbon and energy fluxes and Methods). The bounds are determined from the data of Schmidt et al. [12], as shown in the new Supplementary Figure 1. These simulations also include refined estimates for the half-saturation constants in the metabolic macroreactions.
In the final Results section, Resource allocation strategies enabling fast and efficient growth of Escherichia coli, we develop the point that higher saturation of enzymes and ribosomes is key to high-rate, high-yield growth of E. coli, in agreement with observations from other recent studies [2, 5, 9]. In Appendix 1, we emphasize that S is the extracellular substrate concentration and we distinguish between simplifications of v_mc for batch and continuous growth.


Reviewer #1 (Public Review):
Baldazzi and co-workers propose a resource allocation model for E. coli steady-state cell growth that allows a joint description of growth rate and yield (fraction of substrate converted into biomass) and compare it with a compiled dataset based on batch growth data from different strains and two growth conditions (as well as some chemostat growth data). The model includes a description of alternative respiration and fermentation pathways with different energy efficiency. The model predicts bounds on the achievable state growth rate vs yield space that are compared with data, as well as glucose uptake and acetate secretion rates, which are compared with data.
In my view, the main merits of the model are (i) the compiled dataset of growth-yield-uptake-secretion parameters and (ii) the proposition of a resource-allocation model that includes the energy budget. Contrary to most current models in this area, the biomass includes other cellular components (DNA, RNA, metabolites, ...) in addition to proteins.
The main limitations are that the trends in the data do not emerge well and the predictions of the model are not presented in a transparent way. I believe that considerable extra work is needed in order to valorize the effort and highlight the trends in both data and model. For the data, it suffices to present more "sections" of the dataset (preferably as 2D XY plots) and more reflection on their meaning. Regarding the model, I think more effort is needed towards "breaking it open" and providing insight into why the model makes certain predictions and which ones are not trivial.

Reviewer #2 (Public Review):
The authors' coarse-grained mathematical model is based upon proteome partitioning constraints. Similar models have been developed in the past, although the authors do an excellent job distinguishing their work. The interdependence among growth rate, growth yield, and carbon transport (together with the comparatively few state variables) makes the proposed model an attractive general framework for predictive metabolic engineering and strain optimization in biomanufacturing.
Strengths:
1. The recognition that the constant biomass concentration (1/beta) can be used to recast the growth rate versus growth yield tradeoff in terms of a growth rate versus carbon uptake tradeoff (lines 147-155, Eq. 2), and coupling of the growth and carbon uptake rates through proteome partitioning, are powerful ideas. They transform the traditional (false) dichotomy of a negative correlation between growth and yield into a feasible space of growth-yield combinations (e.g. Figs 2B-C).
2. The authors calibrate the model for E. coli (BW25113) grown in glycerol/glucose, batch/continuous culture (lines 157-164), then apply the model to an impressive variety of E. coli strains. This is not typically done with semi-mechanistic models and elevates the authors' approach by implying that their model is sufficiently general so as to apply across strains, yet sufficiently constrained so as to provide quantitative predictions.
Weaknesses:
1. The tension between generality and constraint leads to some category errors where strain-specific empirical invariants are taken as general strain-independent operating conditions. This happens at least twice: a minor case involving the growth-rate threshold for acetate overflow, and a serious case where the magnitude of the 'housekeeping' proteome fraction phi_q is taken to be strain- and condition-independent.
a. (lines 82-86) The growth-rate threshold for the acetate overflow switch in E. coli was observed in 'studies with a single strain in different conditions' [i.e. different carbon sources in batch]. The interpretation provided in the references cited (lines 83-84) is that the threshold is a manifestation of a tipping point between carbon uptake rate and the costs of energy generation. The carbon uptake rate is implicitly strain-dependent; there is no reasonable expectation that all strains growing in glucose will be fermenting (or all respiring). The conclusion (line 84) that 'the model predicted no correlation between growth rate and acetate secretion rate in the case of different strains growing in the same environment' is tautological when the carbon uptake rate (nu_mc) is used by the authors to distinguish among strains. This error is easily fixed by simply changing the wording, but it serves to illustrate how constraints operating at the strain level can be tacitly (and erroneously) applied at the genus level.
b. The second example of this strain-genus confusion is more serious, and perhaps is enough to unravel the model. One of the strengths of the current framework is that although there are four degrees of freedom via the proteome allocation parameters, the model is sufficiently constrained that the behavior can be meaningfully projected onto lower-dimensional observables like growth rate and yield (e.g. Figs 2B-C).
One of the main constraints in the model that allows this meaningful projection is the assumption that the fraction of 'housekeeping' proteins phi_q is constant irrespective of strain and growth conditions (line 172) and that these proteins carry flux synthesizing non-protein macromolecules (lines 141-142). Neither of these claims is supported by the references provided.
The 'housekeeping' fraction phi_q was inferred in Scott et al. 2010 (line 172) from a nearly growth-medium-independent maximum in the RNA/protein ratio under translation limitation of strain MG1655. The magnitude of that intercept is highly strain-dependent and can vary nearly 2-fold, especially in ALE strains. Furthermore, subsequent proteomic data (e.g. Hui et al. 2015 cited by the authors) has clarified that this 'housekeeping' fraction is, for the most part, composed of growth-rate-independent offsets in the metabolic proteins.
The origin of these offsets is thought to be related to substrate saturation (Eqs. 1 and 2 of Dourado et al. 2021 cited by the authors) and consequently, these offsets (and by extension most of phi_q) carry no flux. Substrate saturation is perhaps at the root of the discrepancy in the Fig. 4 fits that necessitates adjustment of the catalytic constants (line 338). It is not correct to say that 'external substrate concentration S is assumed constant' (bottom p. 25) therefore the catabolic rate nu_mc is an environment-dependent [i.e. substrate-concentration-independent] parameter. The 'mc' proteins include carbon uptake and metabolism (e.g. Fig 1, or Table 2) so that intracellular changes in S could arise from strain differences thereby affecting nu_mc and the magnitude of the 'housekeeping' fraction.
It is not clear to me how the predictive power of the model will be affected by relaxing the constant phi_q assumption and replacing it with the more justifiable assumption that all metabolic proteins contribute some small fraction to phi_q based upon substrate saturation.
