MorphoFeatures for unsupervised exploration of cell types, tissues, and organs in volume electron microscopy
Curation statements for this article:
Curated by eLife
eLife assessment
This paper introduces a fundamentally new automated method for assigning cell types and distinguishing organs in electron microscope (EM) reconstructions, a process that was previously manual. The authors present compelling evidence that their approach works as well or better than human efforts, in at least one species. This new method can help avoid a known bottleneck in EM reconstructions, one that will otherwise limit the ability of EM to scale up to larger volumes and target additional animal species. The main limitation is that the method has only been tested on a single species, but if tests show similar performance on other animals, the method will likely become a mainstay of EM reconstruction efforts.
This article has been reviewed by the following groups
Listed in
- Evaluated articles (eLife)
- Developmental Biology (eLife)
Abstract
Electron microscopy (EM) provides a uniquely detailed view of cellular morphology, including organelles and fine subcellular ultrastructure. While the acquisition and (semi-)automatic segmentation of multicellular EM volumes are now becoming routine, large-scale analysis remains severely limited by the lack of generally applicable pipelines for automatic extraction of comprehensive morphological descriptors. Here, we present a novel unsupervised method for learning cellular morphology features directly from 3D EM data: a neural network delivers a representation of cells by shape and ultrastructure. Applied to the full volume of an entire three-segmented worm of the annelid Platynereis dumerilii, it yields a visually consistent grouping of cells supported by specific gene expression profiles. Integration of features across spatial neighbours can retrieve tissues and organs, revealing, for example, a detailed organisation of the animal foregut. We envision that the unbiased nature of the proposed morphological descriptors will enable rapid exploration of very different biological questions in large EM volumes, greatly increasing the impact of these invaluable but costly resources.
Article activity feed
Reviewer #1 (Public Review):
The basic problem the authors are addressing is the step of assigning cell types. They assume a segmented (divided into cells) EM volume, then try to divide the cells into similar-looking cell types based on shape and internal structure. This step has previously been entirely manual and is quite time-consuming, even in well-known model animals (in the fly hemibrain, this took several months of effort by a team of experts and was the long step in getting to publication).
Their main technical advance is using machine learning to calculate a feature vector for each cell that describes its shape and ultrastructure. Importantly, this vector is shown to capture what humans consider cell types, as clusters in this vector space match both human cell typing and gene-expression-driven cell typing. Equally important, both the derivation of the vectors and the clustering can be done without a human definition of any cell types (they are unsupervised), which is very helpful as sample sizes of EM volumes are increasing much faster than the human effort needed to classify them.
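The unsupervised pipeline described here, one feature vector per cell clustered without any type labels, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature array below is random stand-in data, and the 80-dimensional vector size and 12-cluster count are placeholders.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import AgglomerativeClustering

# Hypothetical input: one 80-dimensional feature vector per cell
# (random stand-in for learned MorphoFeatures).
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 80))

# Standardise so no single feature dominates the distance metric,
# then cluster without any human-defined cell types (unsupervised).
z = StandardScaler().fit_transform(features)
labels = AgglomerativeClustering(n_clusters=12, linkage="ward").fit_predict(z)

print(labels.shape)  # one cluster id per cell
```

On real learned features, the resulting clusters would be the candidate morphological cell types to compare against human typing or gene expression.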
The next advance is considering the vectors from the adjoining cells to classify groups of cells into tissues. On one hand, there are many fewer tissues than cell types, and division into tissues is already done for many model organisms. On the other hand, sometimes organs are not divided by anatomical features, in which case the methods proposed here may be particularly useful. This will also be very helpful when looking at entirely unknown creatures. It would be good to test this on a case where division into organs is thought to be well known.
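Incorporating the vectors of adjoining cells, as described above, amounts to augmenting each cell's own features with a summary of its spatial neighbourhood. The sketch below uses a simple neighbour mean as that summary; the function name and adjacency format are illustrative assumptions, and a real pipeline would derive the adjacency from the segmentation.

```python
import numpy as np

def morpho_context(features, neighbours):
    """Concatenate each cell's own features with the mean features of its
    spatial neighbours (a simple stand-in for a neighbourhood descriptor)."""
    dim = features.shape[1]
    context = np.stack([
        features[nbrs].mean(axis=0) if len(nbrs) else np.zeros(dim)
        for nbrs in neighbours
    ])
    return np.concatenate([features, context], axis=1)

# Toy example: 3 cells with 2-dimensional features.
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
adjacency = [[1, 2], [0], [0, 1]]  # neighbour indices per cell
combined = morpho_context(feats, adjacency)
print(combined.shape)  # (3, 4): own features + neighbourhood mean
```

Clustering the combined vectors instead of the per-cell vectors is what lets groups of cells be assigned to tissues and organs.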
The authors demonstrate that the proposed techniques work well on a model organism, Platynereis dumerilii, matching or exceeding results based on human cell typing and on gene-expression typing, and uncovering previously unknown cell types and organs.
The main limitation, in my mind, is that the methods were only verified on one species. This is understandable, as EM volumes are still time consuming and expensive to acquire, but does not answer how generally applicable the methods are.
Reviewer #2 (Public Review):
This work provides a method for extracting morphological features of cells and their neighborhoods from EM volumes in a self-supervised manner. The authors generate these MorphoFeatures using a set of neural networks, and show the usefulness of the features for cell type classification, symmetric partner identification, and the automated clustering of cells into morphologically similar groups, tissues and organs.
The main innovation of this method compared to similar studies is the separation of the input into shape, coarse texture, and fine texture. A combination of an auto-encoder (for texture features) and a contrastive loss (for all features) is used to obtain features without task-specific bias. The learned features are consistent with cell type when compared to manual annotations and genetic markers. The distinction between shape, coarse, and fine features is not used beyond the development of the method.
The authors later include a descriptor of the cell's neighborhood, with the goal of automatically discovering tissues and organs. Clustering in this MorphoContextFeature space successfully delineates the different parts of the *P. dumerilii* ganglia, and shows some advantages over both manual segmentation and clustering in gene expression space. A detailed analysis of the method on finding tissues in the *P. dumerilii* foregut is given as an additional example.
Strengths
The use of an unsupervised method means that it can be applied to data where no cell types are known a priori, and the authors have made clear that a cell-type classification can be obtained from MorphoFeatures with minimal annotation. Used as a first exploratory pass, this method can help quickly guide and narrow the scope of further analysis.
By separately obtaining features at three levels of resolution, the method has the potential to pinpoint the structural features most predictive of a cell type more precisely than a single-resolution method. Most interestingly, Figure 3 indicates that the learned features are visually meaningful: this would greatly increase the impact of such a method, as it would lead to testable hypotheses.
The training method that the authors suggest for the neural network is sound, and successfully avoids the potential pitfalls of using augmentation with a contrastive loss in a situation where shape is an important signal. Similarly, the authors appropriately choose clustering methods that can discover clusters of varying sizes.
The authors make good use of prior knowledge to confirm the hypotheses generated by clustering cells in MorphoFeature space. They include specific genetic explanations for both expected and unexpected clusters found (Figures 5 and 6), and provide a clear indication of where the gene expression atlas does not explain a MorphoFeature cluster (Figure 6D). The examples given for clustering in the MorphoContextFeature space are similarly clear and well supported by additional data (Figures 8 and 9).
Weaknesses
1. In the section on "visually interpretable" features, I would have liked a more quantitative idea of how many features the authors considered meaningful, and how those can be found. For example, are the six features shown in Figure 3 particularly meaningful, or were they chosen among many? A discussion of the feature selection protocol would be useful for replicating the method on new data. Furthermore, a supplementary figure with some of the features which are not meaningful would give the reader a better idea of the range of interpretability to expect.
2. The section on MorphoContextFeatures is missing a comparison with the MorphoFeatures. This made it unclear to me whether adding the neighborhood information is necessary for the discovery of tissues and organs. This could be remedied with a supplementary figure showing the same analysis as in Figures 7 and 8 on the MorphoFeatures without the additional neighborhood information. Alternatively, since the MorphoFeatures are a subset of the MorphoContextFeatures, the authors could run a post-hoc analysis of whether the MorphoFeatures or the neighborhood features best explain the inter-class variance.
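The post-hoc analysis suggested here, asking whether the cell-intrinsic dimensions or the neighbourhood dimensions better separate the clusters, could be done with, for example, a per-feature ANOVA F-test. This is a sketch on synthetic data under an assumed layout (first 80 columns cell-intrinsic, remainder neighbourhood), not the authors' analysis.

```python
import numpy as np
from sklearn.feature_selection import f_classif

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 160))        # hypothetical MorphoContextFeatures
labels = rng.integers(0, 6, size=500)  # tissue/organ cluster assignments
# Make the "neighbourhood" half weakly informative for this toy example.
X[:, 80:] += labels[:, None] * 0.5

# F-score per feature: how strongly each dimension separates the clusters.
f_scores, _ = f_classif(X, labels)
cell_part = f_scores[:80].mean()     # cell-intrinsic dimensions
context_part = f_scores[80:].mean()  # neighbourhood dimensions
print(context_part > cell_part)      # True for this toy construction
```

Comparing the two averages (or the full F-score distributions) would directly quantify how much of the inter-class variance the neighbourhood information contributes.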
3. Finally, some extra guidance is needed to replicate this work on new data. In particular the following points could use more discussion:
3.1. How to choose the size of the MorphoFeatures vector - did the authors attempt a number other than 80 and if so, what was affected by this choice?
3.2. The protocol for when and how to define sub-clusters - were the chosen thresholds based on prior knowledge such as known tissues/organs? What do the authors suggest if this kind of information is missing?
3.3. How to link the obtained clusters back to specific, potentially meaningful, MorphoFeatures. For example, does the distinctive shape of the enteric neurons in cluster 8.3 of Figure 5 correspond to an extreme of the cytoplasm shape feature described in Figure 3 (lower left)?