Tracking cell lineages in 3D by incremental deep learning

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    Sugawara et al. describe a new interactive tool for 3D cell tracking over time that allows the user to retrain models quickly with updated labels. The utility of such a tool for biologists is great: many experiments require tracking cell divisions or cell movements over time. With a clear comparison to the latest developments in cellular segmentation and an improved procedure for setting up and using the tool, this paper would make an interesting contribution to the image analysis field.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #3 agreed to share their names with the authors.)

Abstract

Deep learning is emerging as a powerful approach for bioimage analysis. Its use in cell tracking is limited by the scarcity of annotated data for the training of deep-learning models. Moreover, annotation, training, prediction, and proofreading currently lack a unified user interface. We present ELEPHANT, an interactive platform for 3D cell tracking that addresses these challenges by taking an incremental approach to deep learning. ELEPHANT provides an interface that seamlessly integrates cell track annotation, deep learning, prediction, and proofreading. This enables users to implement cycles of incremental learning starting from a few annotated nuclei. Successive prediction-validation cycles enrich the training data, leading to rapid improvements in tracking performance. We test the software’s performance against state-of-the-art methods and track lineages spanning the entire course of leg regeneration in a crustacean over 1 week (504 timepoints). ELEPHANT yields accurate, fully-validated cell lineages with a modest investment in time and effort.

Article activity feed

  1. Author Response:

    Reviewer #1:

    The authors introduce a deep learning-based toolbox (ELEPHANT) to ease the annotation and tracking of cells in 3D across time. The study uses two datasets (CE and PH) to demonstrate the performance of the method and compares it with two existing 3D cell tracking methods on segmentation and accuracy metrics. 3D U-Nets have been shown to perform well in segmentation tasks in recent years, and the authors likewise utilize a 3D U-Net for segmenting cells as well as for linking nuclei across time through optical flow. The selected datasets vary in the shape, size, and intensity of cells. Beyond segmentation, the authors also demonstrate the performance of ELEPHANT in exploring tracking results with and without optical flow and in reconstructing fate maps. A complete server-based implementation is provided, with a detailed codebase and Docker images for deploying and using ELEPHANT.

    Strengths:

    The paper is technically sound, with a detailed explanation of each methodological step and result. The 3D U-Nets are optimized with large training sessions for the segmentation task at hand, and the efficiency of the pipeline is nicely demonstrated, which makes this a useful toolbox for real-time annotation and prediction of cell structures. A detailed implementation on both local and remote servers is presented, which is needed when handling and analyzing large-scale bio-imaging datasets. Beyond smoothing, an SSIM-based loss is effectively applied to make the model robust against intensity and structural variations, which helps the segmentation and tracking pipeline generalize.
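
    The SSIM-based loss mentioned here can be illustrated with a minimal sketch. This is a generic, simplified version (a uniform averaging window instead of the Gaussian window of the original SSIM formulation) and not the authors' exact implementation:

    ```python
    import torch
    import torch.nn.functional as F

    def ssim_loss_3d(pred, target, window=7, c1=0.01 ** 2, c2=0.03 ** 2):
        """1 - mean SSIM for 5D tensors of shape (N, C, D, H, W).

        Simplified sketch: local statistics use average pooling instead of
        the Gaussian filter of the original SSIM formulation. Inputs are
        assumed to be scaled to [0, 1].
        """
        pad = window // 2
        mu_x = F.avg_pool3d(pred, window, stride=1, padding=pad)
        mu_y = F.avg_pool3d(target, window, stride=1, padding=pad)
        var_x = F.avg_pool3d(pred * pred, window, stride=1, padding=pad) - mu_x ** 2
        var_y = F.avg_pool3d(target * target, window, stride=1, padding=pad) - mu_y ** 2
        cov = F.avg_pool3d(pred * target, window, stride=1, padding=pad) - mu_x * mu_y
        ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
            (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
        )
        return 1.0 - ssim.mean()
    ```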

    Segmentation results are validated on a large set of nuclei and links, which is helpful for understanding the limitations of the models. The advantage of using optical flow-based linking on top of nearest neighbors is clearly shown. The spatio-temporal distribution of cells in a given dataset guides users in applying the framework to several biological applications, such as tracking the lineage of newly born cells, a hard task in stem cell engineering.

    A detailed implementation for both local and remote servers, as well as an open-source codebase on GitHub, is provided for the scientific community, which will help users easily apply ELEPHANT to their own datasets. Although the CE and PH datasets are used to demonstrate performance, a similar implementation could also be carried out on neuronal datasets, which would be of much use in exploring neurogenesis.

    Weaknesses:

    The authors use ellipsoid-like shapes to annotate the data; however, many cells are not elliptical or circular in shape but have varying morphologies. If the annotation module were equipped with free-form drawing, it would be better able to capture the diverse shapes of cells in both training and validation. This also limits the scope of the study to datasets of cells that are circular/elliptical in shape.

    ELEPHANT can be used to track nuclei or cells of diverse shapes. Tracking is based on reliable detection of nuclei/cells but does not require precise segmentation of their shapes. We have now added results showing that ellipsoid approximations are sufficient for detection and cell tracking, even when tracking cells with complex and variable shapes (Figure 3).

    As we now explain in the manuscript (page 4), we use ellipsoids for annotation because they are essential for rapid and efficient training and predictions, which are the backbone of interactive deep learning. In practice, using ellipsoids also reduces the amount of work required for annotating the data compared with precise drawing of cell outlines. Post-processing can be appended to our workflow if a user needs to extract the precise morphology of cells.
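
    To make this concrete, sparse ellipsoid annotations can be rasterized into voxel-wise training targets along the following lines. This is a minimal numpy sketch with axis-aligned ellipsoids and a hypothetical three-class layout (background / border / center); the function, names, and numbers are illustrative, not ELEPHANT's actual code, and Mastodon ellipsoids can also be arbitrarily oriented, which this sketch ignores:

    ```python
    import numpy as np

    def rasterize_ellipsoid(labels, center, radii, value, shrink=1.0):
        """Write `value` into `labels` inside an axis-aligned ellipsoid.

        labels : (Z, Y, X) integer array, modified in place
        center : ellipsoid center in voxels, (z, y, x)
        radii  : ellipsoid semi-axes in voxels, (rz, ry, rx)
        shrink : scale factor; < 1 marks only the inner "center" region
        """
        rz, ry, rx = (r * shrink for r in radii)
        zs, ys, xs = np.ogrid[:labels.shape[0], :labels.shape[1], :labels.shape[2]]
        inside = (((zs - center[0]) / rz) ** 2
                  + ((ys - center[1]) / ry) ** 2
                  + ((xs - center[2]) / rx) ** 2) <= 1.0
        labels[inside] = value

    # Example: a 3-class target (0 = background, 1 = border, 2 = center)
    # built from two annotated nuclei; all numbers are made up.
    target = np.zeros((32, 64, 64), dtype=np.uint8)
    for c, r in [((16, 20, 20), (4, 6, 6)), ((16, 40, 44), (4, 5, 7))]:
        rasterize_ellipsoid(target, c, r, value=1)               # whole ellipsoid
        rasterize_ellipsoid(target, c, r, value=2, shrink=0.5)   # inner core
    ```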

    The authors use a 3D U-Net for segmentation, which is a semantic segmenter; perhaps an instance-based 3D segmenter would be a better choice for tracking the identity of cells across time and space. An instance-based segmenter may not be ideal for segmenting cell boundaries, but a comparison between a 3D U-Net and an instance-based 3D segmenter on the same datasets would be helpful for evaluation.

    Although the original 3D U-Net is a semantic segmenter, we use its architecture to estimate the center region of cells, which works as an instance-wise detector. A similar strategy was followed by recent techniques (Kok et al. 2020, bioRxiv doi:10.1101/2020.03.18.996421; Scherr et al. 2020, PLoS One doi:10.1371/journal.pone.0243219) to identify cell instances. Instance-based segmenters (e.g. StarDist, Mask R-CNN) are particularly useful for precise segmentation, but our primary focus here is detection and tracking, which can be done most efficiently with the current architecture. Because StarDist and Mask R-CNN do not support sparse annotations, a direct comparison with these methods is difficult at the moment.
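
    As a rough illustration of how a semantic center-region prediction yields instance detections, one can threshold the predicted center probability and take connected components. This is a generic scipy sketch, not the exact post-processing used by ELEPHANT:

    ```python
    import numpy as np
    from scipy import ndimage

    def centers_to_instances(center_prob, threshold=0.5, min_voxels=10):
        """Turn a voxel-wise center-probability map into detections.

        center_prob : (Z, Y, X) array of predicted center probabilities
        Returns a list of (z, y, x) centroids, one per detected nucleus.
        """
        mask = center_prob > threshold
        labeled, n = ndimage.label(mask)          # one label per connected blob
        centroids = []
        for i in range(1, n + 1):
            blob = labeled == i
            if blob.sum() < min_voxels:           # drop tiny spurious components
                continue
            centroids.append(ndimage.center_of_mass(blob))
        return centroids
    ```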

    The selected datasets seem to capture diversity in shape and intensity; however, biological imaging datasets in practice often have a low signal-to-noise ratio, variation in cell density, overlapping cells, etc. The selected datasets seem to lack these kinds of diversity, and performance on other data of this kind would be useful for evaluation, as would providing a pre-trained model for community use. Moreover, it would also be useful to demonstrate the performance of the framework in segmenting and tracking a 3D neuronal nuclei dataset, which would broaden the scope of the study.

    The PH dataset that we used for testing ELEPHANT presents many challenges, such as variations in intensity, areas of low signal-to-noise ratio, and densely packed and overlapping nuclei (see manuscript page 7, Suppl. Figure 5). To add to this analysis, we have now applied our method to additional datasets that show diverse characteristics – including datasets with elongated/irregular-shaped cells from the Cell Tracking Challenge (Figure 3E) and organoids imaged by light and confocal microscopy (Figure 3C,D) – demonstrating the versatility of our method. We do not think that neuronal nuclei present a particular challenge for ELEPHANT (the PH dataset includes neurons).

    We now also provide a pre-trained model, trained with diverse image datasets, which can be applied by users as a starting point for tracking on new image data.

    The 3D U-Nets are used for linking, with the difference between two consecutive images (across time) as labels. In theory this technique helps to track cells, but it may also lose cell identities when cells overlap or when boundary features are less prominent. Perhaps a specialized deep neural network such as FlowNet3D would be a better choice here.

    Our 3D U-Net does not directly generate links across consecutive images. Instead, it produces voxel-wise optical flow maps for each of the three dimensions, which are then combined with the detection results to predict the position of each object (see manuscript page 6 and Methods). This is then used for linking. The identity of the tracked objects is defined during detection.

    In the end, our approach is similar to FlowNet3D in that both estimate optical flow for each detected object, although we use two consecutive images as input instead of the sets of detected objects. FlowNet3D operates only on object coordinates, without taking into account image features that could be important cues for cell tracking (e.g. fluorescence intensity of nuclei during cell division).
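
    In outline, linking with a flow map amounts to displacing each detected centroid by the flow sampled at its position and matching it to the nearest detection in the neighboring frame. The following is a minimal nearest-neighbour sketch with illustrative names; ELEPHANT's actual linking workflow includes additional logic (e.g. for cell divisions):

    ```python
    import numpy as np
    from scipy.spatial import cKDTree

    def link_with_flow(centroids_t, centroids_t1, flow, max_dist=5.0):
        """Illustrative sketch: link detections at timepoint t to t + 1.

        centroids_t  : (N, 3) array of (z, y, x) detections at t
        centroids_t1 : (M, 3) array of detections at t + 1
        flow         : (3, Z, Y, X) voxel-wise displacement from t to t + 1
        Returns a list of (index_t, index_t1) links.
        """
        tree = cKDTree(centroids_t1)
        upper = np.array(flow.shape[1:]) - 1
        links = []
        for i, c in enumerate(centroids_t):
            z, y, x = np.clip(np.round(c).astype(int), 0, upper)
            predicted = c + flow[:, z, y, x]   # displace centroid by the flow
            dist, j = tree.query(predicted)    # nearest detection at t + 1
            if dist <= max_dist:
                links.append((i, j))
        return links
    ```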

    Reviewer #2:

    The authors created a cell tracking tool, which they claimed was user-friendly and achieved state-of-the-art performance.

    Would a user, particularly a biologist, be able to run the code from a set of instructions clearly defined in the readme? This was not possible for me. I am not familiar with Java or Mastodon, but I'm not sure we can expect the average biologist to be familiar with these tools either. I was very impressed by the interface provided, though.

    We have updated the user manual and software interface to make the software more accessible for users. Moreover, ELEPHANT is now available as an extension on Fiji, which will greatly facilitate its adoption by non-expert users.

    Did the authors achieve state-of-the-art performance? It is unclear from the paper. It would be helpful to see comparisons of this tool with modern deep learning approaches such as StarDist; the StarDist paper, for instance, reports performance on the Parhyale dataset. Many people in the field are combining tools like StarDist with cell tracking tools like TrackMate (e.g. see https://www.biorxiv.org/content/10.1101/2020.09.22.306233v1). It would be important to know whether one can get performance comparable to StarDist (at e.g. a 0.5 IoU threshold) on a single 3D stack with this sparse labelling and interactive approach. I still think this approach of using sparse labelling could be very useful for transferring to novel datasets, but it is difficult to justify the framework if there is a large drop in performance compared to a fully supervised algorithm.

    The novelty of ELEPHANT lies in making deep learning available for cell tracking and lineaging to users who do not have extensive annotated datasets for training. Existing deep learning applications (including StarDist) do not fulfill this purpose.

    The detection and tracking scores of ELEPHANT in the Cell Tracking Challenge (identified as IGFL-FR) were the best when applied to cell lineaging on C. elegans test datasets, compared to a large number of other tracking applications (http://celltrackingchallenge.net/latest-ctb-results/). This comparison includes methods that employ deep-learning.

    ELEPHANT models trained with sparse annotations perform similarly well to trained StarDist3D models for nuclear detection in single 3D stacks (see Supplementary Figure 8). For cell tracking over time, the StarDist–TrackMate combination has so far only been implemented in 2D.
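
    For reference, detection performance at a given IoU threshold can be scored by greedily matching predicted and ground-truth instances on their pairwise IoUs. This is a standard-recipe sketch, not the exact evaluation code used in the paper:

    ```python
    import numpy as np

    def detection_f1(ious, threshold=0.5):
        """F1 score from an (n_pred, n_gt) IoU matrix by greedy matching."""
        n_pred, n_gt = ious.shape
        ious = ious.copy()
        tp = 0
        while ious.size and ious.max() >= threshold:
            i, j = np.unravel_index(np.argmax(ious), ious.shape)
            tp += 1
            ious[i, :] = 0.0   # each prediction matches at most one object
            ious[:, j] = 0.0   # and vice versa
        fp, fn = n_pred - tp, n_gt - tp
        return 2 * tp / max(2 * tp + fp + fn, 1)
    ```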

    Reviewer #3:

    This work describes a new open-source tool (ELEPHANT, https://elephant-track.github.io/) for efficient and interactive training of a deep learning-based cell detection and tracking model. It uses the existing Fiji plugin Mastodon as an interactive front end (https://github.com/mastodon-sc/mastodon). Mastodon is a large-scale tracking and track-editing framework for large, multi-view images. The authors' contribution is an extension of Mastodon that adds automated deep learning-based cell detection and tracking. Technically, this is achieved by connecting Mastodon as a client (written in Java) to a deep learning server (written in Python). The server can run on a different, dedicated computer capable of the GPU-based computations that are needed for deep learning. This framework makes the detection and tracking of cells in very large volumetric datasets possible within a user-friendly graphical user interface.
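
    To make the client-server split concrete: the Java client issues HTTP requests to the Python server, roughly along the following lines. This is a minimal Flask sketch with a hypothetical endpoint and payload; ELEPHANT's real API differs:

    ```python
    # Minimal sketch of a deep learning server endpoint, in the spirit of the
    # ELEPHANT client-server design. The endpoint path and JSON fields are
    # hypothetical; the real ELEPHANT API differs.
    import numpy as np
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/predict/detection", methods=["POST"])
    def predict_detection():
        spec = request.get_json()        # e.g. {"dataset": "PH", "timepoint": 3}
        # ... load the requested volume and run the 3D U-Net on the GPU ...
        centroids = np.zeros((0, 3))     # placeholder for detected centers
        return jsonify({"timepoint": spec["timepoint"],
                        "centroids": centroids.tolist()})

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)   # reachable from the Mastodon client
    ```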

    Strengths:

    1. It is great to reuse an existing front-end framework like Mastodon and plug in a deep learning back-end! Such software design avoids reinventing the wheel and spares users from having to learn too many tools.
    2. The idea to use sparse ellipsoids as annotations for cell detection is in my view fantastic, as it allows very efficient annotation. This is much faster than having to paint dense 3D ground truth, as is required for most deep learning algorithms.
    3. It is great that the learning is so fast that it is essentially interactive!

    Opportunities for improvements:

    The software in its current form has a few issues that make it a little hard to use. It would be great if those could be addressed in future versions.

    1. There are several options for how to set up the ELEPHANT server. In any case, this requires quite some technical knowledge that may prevent adoption by a broader user base. It would thus be great if this could be further streamlined.

    We thank reviewer 3 for the very useful and detailed suggestions on improving the user interface of ELEPHANT. We have implemented most of these suggestions and we plan to pursue additional ones in future versions of the software. In brief:

    • To facilitate the setting up of the ELEPHANT server, we have implemented a control panel that allows users to monitor the process and provides links to the relevant section of the user manual and to Google Colab.
    • ELEPHANT is now available as an extension on Fiji, which will greatly facilitate its use by non-expert users.
    • Pre-trained detection and linking models, trained on diverse image datasets, are now available on the ELEPHANT GitHub.
    • Image data can be uploaded and converted automatically via the Fiji/Mastodon interface when the image data files are missing on the server.
    2. For GUI-based software, it is becoming state-of-the-art to provide recorded videos that demonstrate how to use the software. This is much more telling than written text. The authors added very nice short videos to the documentation, but I think it would be essential to also provide a longer video (ideally with voice-over) where the authors demonstrate the whole workflow in one go.

    We are preparing a demo video on YouTube, which will be embedded in the user manual.

    3. As a user, one interacts with the Mastodon software, which sends requests to the ELEPHANT server. It would be great if the feedback on what is going on server-side could be improved. For example, adding progress bars and metrics for the progress of the deep learning training, visualized within Mastodon, would be, in my view, very important for usability.

    We added a log window in which users can monitor the processes that are running on the server.

  2. Evaluation Summary:

    Sugawara et al. describe a new interactive tool for 3D cell tracking over time that allows the user to retrain models quickly with updated labels. The utility of such a tool for biologists is great: many experiments require tracking cell divisions or cell movements over time. With a clear comparison to the latest developments in cellular segmentation and an improved procedure for setting up and using the tool, this paper would make an interesting contribution to the image analysis field.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #3 agreed to share their names with the authors.)

  3. Reviewer #1 (Public Review):

    The authors introduce a deep learning-based toolbox (ELEPHANT) to ease the annotation and tracking of cells in 3D across time. The study uses two datasets (CE and PH) to demonstrate the performance of the method and compares it with two existing 3D cell tracking methods on segmentation and accuracy metrics. 3D U-Nets have been shown to perform well in segmentation tasks in recent years, and the authors likewise utilize a 3D U-Net for segmenting cells as well as for linking nuclei across time through optical flow. The selected datasets vary in the shape, size, and intensity of cells. Beyond segmentation, the authors also demonstrate the performance of ELEPHANT in exploring tracking results with and without optical flow and in reconstructing fate maps. A complete server-based implementation is provided, with a detailed codebase and Docker images for deploying and using ELEPHANT.

    Strengths:

    The paper is technically sound, with a detailed explanation of each methodological step and result. The 3D U-Nets are optimized with large training sessions for the segmentation task at hand, and the efficiency of the pipeline is nicely demonstrated, which makes this a useful toolbox for real-time annotation and prediction of cell structures. A detailed implementation on both local and remote servers is presented, which is needed when handling and analyzing large-scale bio-imaging datasets. Beyond smoothing, an SSIM-based loss is effectively applied to make the model robust against intensity and structural variations, which helps the segmentation and tracking pipeline generalize.

    Segmentation results are validated on a large set of nuclei and links, which is helpful for understanding the limitations of the models. The advantage of using optical flow-based linking on top of nearest neighbors is clearly shown. The spatio-temporal distribution of cells in a given dataset guides users in applying the framework to several biological applications, such as tracking the lineage of newly born cells, a hard task in stem cell engineering.

    A detailed implementation for both local and remote servers, as well as an open-source codebase on GitHub, is provided for the scientific community, which will help users easily apply ELEPHANT to their own datasets. Although the CE and PH datasets are used to demonstrate performance, a similar implementation could also be carried out on neuronal datasets, which would be of much use in exploring neurogenesis.

    Weaknesses:

    The authors use ellipsoid-like shapes to annotate the data; however, many cells are not elliptical or circular in shape but have varying morphologies. If the annotation module were equipped with free-form drawing, it would be better able to capture the diverse shapes of cells in both training and validation. This also limits the scope of the study to datasets of cells that are circular/elliptical in shape.

    The authors use a 3D U-Net for segmentation, which is a semantic segmenter; perhaps an instance-based 3D segmenter would be a better choice for tracking the identity of cells across time and space. An instance-based segmenter may not be ideal for segmenting cell boundaries, but a comparison between a 3D U-Net and an instance-based 3D segmenter on the same datasets would be helpful for evaluation.

    The selected datasets seem to capture diversity in shape and intensity; however, biological imaging datasets in practice often have a low signal-to-noise ratio, variation in cell density, overlapping cells, etc. The selected datasets seem to lack these kinds of diversity, and performance on other data of this kind would be useful for evaluation, as would providing a pre-trained model for community use. Moreover, it would also be useful to demonstrate the performance of the framework in segmenting and tracking a 3D neuronal nuclei dataset, which would broaden the scope of the study.

    The 3D U-Nets are used for linking, with the difference between two consecutive images (across time) as labels. In theory this technique helps to track cells, but it may also lose cell identities when cells overlap or when boundary features are less prominent. Perhaps a specialized deep neural network such as FlowNet3D would be a better choice here.

  4. Reviewer #2 (Public Review):

    The authors created a cell tracking tool, which they claimed was user-friendly and achieved state-of-the-art performance.

    Would a user, particularly a biologist, be able to run the code from a set of instructions clearly defined in the readme? This was not possible for me. I am not familiar with Java or Mastodon, but I'm not sure we can expect the average biologist to be familiar with these tools either. I was very impressed by the interface provided, though.

    Did the authors achieve state-of-the-art performance? It is unclear from the paper. It would be helpful to see comparisons of this tool with modern deep learning approaches such as StarDist; the StarDist paper, for instance, reports performance on the Parhyale dataset. Many people in the field are combining tools like StarDist with cell tracking tools like TrackMate (e.g. see https://www.biorxiv.org/content/10.1101/2020.09.22.306233v1). It would be important to know whether one can get performance comparable to StarDist (at e.g. a 0.5 IoU threshold) on a single 3D stack with this sparse labelling and interactive approach. I still think this approach of using sparse labelling could be very useful for transferring to novel datasets, but it is difficult to justify the framework if there is a large drop in performance compared to a fully supervised algorithm.

  5. Reviewer #3 (Public Review):

    This work describes a new open-source tool (ELEPHANT, https://elephant-track.github.io/) for efficient and interactive training of a deep learning-based cell detection and tracking model. It uses the existing Fiji plugin Mastodon as an interactive front end (https://github.com/mastodon-sc/mastodon). Mastodon is a large-scale tracking and track-editing framework for large, multi-view images. The authors' contribution is an extension of Mastodon that adds automated deep learning-based cell detection and tracking. Technically, this is achieved by connecting Mastodon as a client (written in Java) to a deep learning server (written in Python). The server can run on a different, dedicated computer capable of the GPU-based computations that are needed for deep learning. This framework makes the detection and tracking of cells in very large volumetric datasets possible within a user-friendly graphical user interface.

    Strengths:

    1. It is great to reuse an existing front-end framework like Mastodon and plug in a deep learning back-end! Such software design avoids reinventing the wheel and spares users from having to learn too many tools.

    2. The idea to use sparse ellipsoids as annotations for cell detection is in my view fantastic, as it allows very efficient annotation. This is much faster than having to paint dense 3D ground truth, as is required for most deep learning algorithms.

    3. It is great that the learning is so fast that it is essentially interactive!

    Opportunities for improvements:

    The software in its current form has a few issues that make it a little hard to use. It would be great if those could be addressed in future versions.

    1. There are several options for how to set up the ELEPHANT server. In any case, this requires quite some technical knowledge that may prevent adoption by a broader user base. It would thus be great if this could be further streamlined (I shared some specific ideas with the authors).

    2. For GUI-based software, it is becoming state-of-the-art to provide recorded videos that demonstrate how to use the software. This is much more telling than written text. The authors added very nice short videos to the documentation, but I think it would be essential to also provide a longer video (ideally with voice-over) where the authors demonstrate the whole workflow in one go.

    3. As a user, one interacts with the Mastodon software, which sends requests to the ELEPHANT server. It would be great if the feedback on what is going on server-side could be improved. For example, adding progress bars and metrics for the progress of the deep learning training, visualized within Mastodon, would be, in my view, very important for usability.