Discovering data-driven microbial growth models with symbolic regression

Abstract

  • Connecting mathematical models with empirically measured microbial growth has remained challenging, as numerous competing models based on different theoretical approaches can fit observations. Therefore, we develop a method to automatically propose growth models from microbial data alone. We validate this approach using an available dataset of E. coli grown on known resources, and study sixteen species across various concentrations of a rich medium.

  • The inherently interpretable approach of symbolic regression infers explicit dynamical models directly from growth data. Used natively, symbolic regression does not favour biologically interpretable models, but we find cumulative population gain to be a more informative machine-learning feature than population size.

  • Random Forest machine learning allows us to relate this finding to the approximation of a constant-rate per capita resource consumption. This suggests that the area under the growth curve (AUC) measured in routine experiments provides information on the effective resource dynamics governing microbial growth. Finally, we use theoretical insights to inform the symbolic regression algorithm and favour biologically interpretable models.

  • Overall, we found that balancing data fit, parsimony and biological relevance favoured both the simplest linear approximation and models based on Monod dynamics, with either one or two underlying resources. Therefore, our approach of reading growth laws off microbial batch cultures in a data-driven manner provides insight into the models’ data-based parsimony.

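The link between the area under the growth curve and resource dynamics mentioned in the abstract can be illustrated with a minimal sketch. It assumes that per capita resource consumption is a constant c, so that dR/dt = -c N(t) and hence R(t) = R0 - c * AUC(t), where AUC(t) is the running integral of population size. The function names, the synthetic logistic curve, and the values of `r0` and `c` are hypothetical choices made for illustration; they are not taken from the article.

```python
import numpy as np


def cumulative_auc(t, n):
    """Running trapezoidal integral of n(t); the first entry is 0."""
    t = np.asarray(t, dtype=float)
    n = np.asarray(n, dtype=float)
    steps = 0.5 * (n[1:] + n[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(steps)))


def effective_resource(t, n, r0, c):
    """Resource level implied by constant per capita consumption c (assumption):
    R(t) = r0 - c * integral_0^t n(s) ds."""
    return r0 - c * cumulative_auc(t, n)


# Synthetic logistic growth curve standing in for a measured batch culture
# (illustrative values only).
t = np.linspace(0.0, 24.0, 97)                 # time in hours
n = 1.0 / (1.0 + 99.0 * np.exp(-0.5 * t))      # population size, arbitrary units

auc = cumulative_auc(t, n)
resource = effective_resource(t, n, r0=1.0, c=0.05)
```

Under this approximation the cumulative AUC is, up to the factor c, the amount of resource consumed so far, which is one way to see why a cumulative feature can carry more information about growth than population size alone.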