Representation learning of single-cell time-series with deep variational autoencoders
Abstract
Single-cell technologies have led to many insights into the function of individual cells within populations. Time-series data, in particular, have become increasingly adopted to uncover the molecular basis of heterogeneity in processes such as gene regulation, growth adaptation, and drug resistance. Yet the analysis of single-cell trajectories often requires manual, application-dependent techniques to extract meaningful features that correlate with the biological processes of interest. Here, we employ representation learning to automatically encode time-series into a low-dimensional feature space that can be used for downstream prediction tasks. We trained deep variational autoencoders on cell-length data from Escherichia coli cells growing in a “mother machine” device, which allows tracking of single cells over long periods. We show that the learned representations preserve the structure of the data across various growth media. Using the pretrained model in tandem with supervised learning, we encoded fresh time-series from cells exposed to single and combination antibiotics, achieving excellent classification accuracy across drug treatments. Moreover, we demonstrate that the learned embeddings can be used to accurately classify datasets from different laboratories and growth conditions without retraining, indicating that the autoencoder extracts meaningful information with promising generalization power. Our results highlight the potential of representation learning for the analysis of single-cell responses to chemical perturbations and changing growth conditions.
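To make the workflow concrete, the sketch below shows a minimal variational autoencoder for fixed-length single-cell trajectories in PyTorch. It is an illustrative assumption, not the architecture used in this work: the 1D-convolutional encoder/decoder, layer sizes, sequence length, and latent dimension are hypothetical choices, and the posterior means are used as the low-dimensional features that a downstream classifier (e.g. for antibiotic treatments) could consume.

```python
# Minimal sketch of a VAE for fixed-length single-cell time-series
# (e.g. cell-length trajectories). Hypothetical architecture; the
# authors' actual model, hyperparameters, and training setup may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TimeSeriesVAE(nn.Module):
    def __init__(self, seq_len: int = 256, latent_dim: int = 16):
        super().__init__()
        # Encoder: 1D convolutions over the time axis, then dense heads
        # producing the mean and log-variance of the latent Gaussian.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Flatten(),
        )
        enc_out = 64 * (seq_len // 4)
        self.fc_mu = nn.Linear(enc_out, latent_dim)
        self.fc_logvar = nn.Linear(enc_out, latent_dim)
        # Decoder mirrors the encoder and reconstructs the trajectory.
        self.fc_dec = nn.Linear(latent_dim, enc_out)
        self.decoder = nn.Sequential(
            nn.Unflatten(1, (64, seq_len // 4)),
            nn.ConvTranspose1d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(32, 1, kernel_size=4, stride=2, padding=1),
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps, so gradients flow through mu and logvar.
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        x_hat = self.decoder(self.fc_dec(z))
        return x_hat, mu, logvar


def vae_loss(x, x_hat, mu, logvar, beta: float = 1.0):
    # Reconstruction term plus KL divergence to the standard normal prior.
    recon = F.mse_loss(x_hat, x, reduction="mean")
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl


# Usage: encode a batch of trajectories into the low-dimensional feature
# space used for downstream prediction tasks.
model = TimeSeriesVAE(seq_len=256, latent_dim=16)
x = torch.randn(8, 1, 256)        # 8 synthetic cell-length traces
x_hat, mu, logvar = model(x)
loss = vae_loss(x, x_hat, mu, logvar)
embeddings = mu                   # posterior means as learned features
```

In this pretrain-then-classify pattern, the encoder is trained once on unperturbed growth data and then frozen, and a separate supervised classifier is fit on the embeddings of newly encoded time-series.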