Training Generalized Segmentation Networks with Real and Synthetic Cryo-ET data
Abstract
Deep learning excels at segmenting objects within noisy cryo-electron tomograms, but the approach is typically bottlenecked by access to ground truth training data. To address this issue, we have developed CryoTomoSim (CTS), an open-source software package that builds coarse-grained models of macromolecular complexes embedded in vitreous ice and then simulates transmitted electron tilt series for tomographic reconstruction. Using CTS outputs, we demonstrate the effects of key microscope parameters (dose, defocus, and pixel size) on deep learning-based segmentation, and show that including both molecular crowding and diversity within synthetic datasets is essential for training cellular segmentation networks from purely synthetic inputs. While very effective as initial models, these networks are currently limited in accuracy, and real cellular data is necessary to train the most accurate and generalizable U-Nets. Using a co-training approach, we first segment over 100 tomograms from neuronal growth cones to quantify their cytoskeletal distributions, and then build a generalized cellular cryo-ET segmentation network called NeuralSeg that can segment a subset of cellular features in tomograms from all domains of life.