A generalizable brain extraction net (BEN) for multimodal MRI data from rodents, nonhuman primates, and humans

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    This article is a valuable contribution to the field of neuroimaging. The paper proposes a deep neural network for brain extraction that generalises across domains, including species, scanners, and MRI sequences. Although in some sense brain extraction is not a challenging problem for deep learning, domain generalisation can be. The authors provide solid evidence that their approach works, though new data may need to be precisely matched to the training data.


Abstract

Accurate brain tissue extraction from magnetic resonance imaging (MRI) data is crucial for analyzing brain structure and function. While several conventional tools have been optimized to handle human brain data, there have been no generalizable methods to extract brain tissue from multimodal MRI data of rodents, nonhuman primates, and humans. Therefore, developing a flexible and generalizable method for extracting whole brain tissue across species would allow researchers to analyze and compare experimental results more efficiently. Here, we propose a domain-adaptive and semi-supervised deep neural network, named the Brain Extraction Net (BEN), to extract brain tissue across species, MRI modalities, and MR scanners. We have evaluated BEN on 18 independent datasets, including 783 rodent MRI scans, 246 nonhuman primate MRI scans, and 4601 human MRI scans, covering five species, four modalities, and six MR scanners with various magnetic field strengths. Compared to conventional toolboxes, BEN demonstrates superior robustness, accuracy, and generalizability. Our proposed method not only provides a generalized solution for extracting brain tissue across species but also significantly improves the accuracy of atlas registration, thereby benefiting downstream processing tasks. As a novel fully automated deep-learning method, BEN is designed as open-source software to enable high-throughput processing of neuroimaging data across species in preclinical and clinical applications.

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    This paper proposes a 2D U-Net with attention and adaptive batchnorm modules to perform brain extraction that generalises across species. Generalisation is supported by a semi-supervised learning strategy that leverages test-time Monte Carlo uncertainty to integrate the best-predicted labels into the training strategy. Monte Carlo dropout maps also tend to align with inter-rater disagreement from manual segmentations, meaning that they can realistically be used for fast QC. The networks (trained on a range of source domains) have been made publicly available, meaning that it should be relatively simple for users to apply them to their own cohorts, allowing for retraining on a very small number of labelled datasets. Overall, the paper is exceptionally well written and validated, and the tool has broad application.
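
    For readers less familiar with this mechanism, the minimal sketch below illustrates test-time Monte Carlo dropout for a segmentation network; it assumes a generic PyTorch model containing dropout layers (the names `model`, `volume`, and `mc_dropout_predict` are placeholders) and is not the authors' implementation.

    ```python
    import torch

    def mc_dropout_predict(model, volume, n_samples=10):
        """Mean prediction and per-voxel uncertainty from stochastic forward passes.

        `model` is assumed to be a segmentation network containing dropout layers;
        `volume` is a batched input tensor. Neither name refers to the authors' code.
        """
        model.eval()
        # Re-enable only the dropout layers so each forward pass is stochastic,
        # while batch-norm statistics stay frozen.
        for m in model.modules():
            if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout2d, torch.nn.Dropout3d)):
                m.train()

        with torch.no_grad():
            probs = torch.stack([torch.sigmoid(model(volume)) for _ in range(n_samples)])

        mean_prob = probs.mean(dim=0)   # soft brain-mask prediction
        uncertainty = probs.std(dim=0)  # high where the stochastic passes disagree
        return mean_prob, uncertainty
    ```

    In a semi-supervised loop, predictions whose mean uncertainty falls below a chosen threshold can be promoted to pseudo-labels, and the uncertainty map itself can be overlaid on the image for quick QC.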

    We thank this reviewer very much for these encouraging and valuable comments.

    Reviewer #2 (Public Review):

    In this manuscript, the authors propose a generalizable solution for masking brains in medical images from multiple species. This is done via a deep learning architecture whose key innovation is to incorporate domain transfer techniques that should allow the trained networks to work out of the box on new data or, more likely, to need only a limited training set of a few segmented brains in order to become successful.

    The authors show applications of their algorithm to mice, rats, marmosets, and humans. In all cases, they were able to obtain high Dice scores (>0.95) with only a very small number of labelled datasets. Moreover, because the segmentation is deep-learning-based, it is very fast once a network has been trained.
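
    For reference, the Dice score quoted here is the standard overlap measure between a predicted mask A and a reference mask B (a generic definition, not specific to this work):

    ```latex
    \mathrm{Dice}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|}
    ```

    A value of 1 indicates perfect agreement, so scores above 0.95 correspond to near-complete overlap with the manual masks.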

    The promise of this work is twofold: first, to allow for the easy creation of brain-masking pipelines in species or modalities where no such algorithms exist; and second, to provide higher accuracy or robustness of brain masking compared to existing methods.

    I believe that the authors overstate the importance of generalizability somewhat, as masking brains is something that we can by and large do well across multiple species. Human brains are often handled by specialized tools that the authors acknowledge work well, and the usually simpler non-human (i.e. lissencephalic rodent) brains are also handled well using image registration or multi-atlas segmentation techniques. So generalizability adds definite convenience but is not a game-changer.

    The key question for the proposed algorithm is thus whether it works better than, or at least as well as, existing tools. The authors show multiple convincing examples that this is the case, even after retraining with only a few samples. Yet in those examples, the authors propose retraining the network even for subtle acquisition changes, such as moving in field strength from 7 T to 9.4 T. I tried it on some T2-weighted ex-vivo and T1-weighted manganese-enhanced in-vivo mouse data and found that the trained brain extraction net does not generalize well. None of the pre-trained networks provided by the authors produced reasonable masks on my data. Using their domain adaptation retraining algorithm on ~20 brains each resulted in, as promised, excellent brain segmentations. Yet even subtle changes to out-of-sample inputs degraded performance significantly. For example, one set of data with a slight intensity drop-off due to a misplaced saturation band produced masks that incorrectly excluded those lower-intensity voxels. Similarly, training on normal brains and applying the trained algorithm to brains with stroke-induced lesions caused the lesions to be incorrectly masked. BEN thus seems to need regular retraining on inputs that very precisely match its training data. In both of those examples, the usual image registration/multi-atlas segmentation approach we use for brain masking worked without needing any adaptation.

    Overall, this paper is filled with excellent ideas for a generalized brain extraction deep learning algorithm that features domain adaptation to allow easy retraining to meet different inputs, be they species or sequence types. The authors are to be highly commended for their work. Yet at the moment it appears to produce overtrained networks that are challenged by even subtle shifts in inputs, something I believe needs to be addressed for BEN to truly meet its promised potential.

    We sincerely thank the reviewer for these constructive comments. We appreciate that the article is considered a valuable contribution to the field of neuroimaging, providing BEN as an efficient and generalisable deep-learning-based tool for brain extraction. The major concern of this reviewer is that a pretrained BEN gives unsatisfactory performance on some external data (e.g. the reviewer’s own data), although the domain adaptation retraining algorithm on ~20 brains did lead to, as promised, excellent segmentation results. Here, we would like to emphasize that the initial version of BEN on GitHub was designed to reproduce the results presented in the manuscript, not optimized for processing external datasets. To address this issue, we have optimized the BEN pipeline in the revised version, as summarized below:

    1. Orientation detection. In the original version of BEN, the rodent training images were all in axial view, so BEN performed best on test images in axial view. If rodent MR images are loaded in other views (such as sagittal or coronal), its performance therefore degrades. To solve this issue, we have added an orientation-detection function to the BEN pipeline that automatically aligns other orientations to the axial view, thus optimizing BEN’s performance (a sketch of the idea is given after this list).

    2. Performance optimization using plug-and-play functions. We have added post-processing steps to improve performance, as well as running logs for quick inspection.

    3. Validation and tutorials. To further validate BEN’s generalization, we have evaluated BEN on two new external public ex-vivo MRI datasets (rTg4510 mouse: 25 ex-vivo scans; C57BL/6 mouse: 15 ex-vivo scans). When only one label is used for BEN adaptation/retraining, impressive performance is achieved on both datasets, even though BEN was originally designed for in-vivo MRI data. To make the implementation transparent and give detailed guidance to users, we have prepared video tutorials in our GitHub documentation (https://github.com/yu02019/BEN#video-tutorials). Note that BEN’s performance may degrade when dealing with MR images of low quality. As an open-source tool, BEN is extensible, and our team will continue to maintain and update it.
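
    As a rough illustration of the orientation-detection step in point 1, the sketch below reorients a NIfTI scan to the closest canonical orientation using nibabel; the function name is illustrative and this is not the actual BEN code.

    ```python
    import nibabel as nib

    def load_axial_canonical(path):
        """Load a NIfTI scan and reorient it to the closest canonical (RAS+) orientation,
        so that slices taken along the last axis correspond to axial views.

        Illustrative sketch of the idea behind the orientation-detection step,
        not the actual BEN implementation.
        """
        img = nib.load(path)
        canonical = nib.as_closest_canonical(img)   # reorder/flip axes to RAS+
        print("original orientation:", nib.aff2axcodes(img.affine))
        print("canonical orientation:", nib.aff2axcodes(canonical.affine))
        return canonical.get_fdata(), canonical.affine
    ```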

    Nevertheless, there are a couple of possible reasons for suboptimal performance when using a pretrained BEN. We discuss them below and have revised the manuscript accordingly (last paragraph of the Discussion).

    On the one hand, as pointed out by the reviewer, domain generalization is a challenging task for deep learning. Although BEN can adapt to new out-of-domain images without labels (zero-shot learning) when the domain shift is relatively small (e.g. successful transfer between modalities and between scanners with different field strengths), the domain gap between the ex-vivo MRI data used by the reviewer and the in-vivo images in our training set may be so large that it compromises performance. In this case, additional labeled data and retraining are indeed necessary for BEN to perform few-shot learning, which we have emphasized and demonstrated in our manuscript and which the reviewer confirmed (although in our opinion, fewer than 5 labeled brains, rather than 20, may be sufficient to complete the task).
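
    To make concrete what few-shot retraining involves, the following is a generic sketch of fine-tuning a pretrained segmentation network on a handful of labelled target-domain scans with a soft Dice loss; the model, data, and hyperparameters are placeholders, and this is not BEN’s actual training interface.

    ```python
    import torch

    def few_shot_adapt(pretrained_model, labelled_volumes, labelled_masks,
                       epochs=50, lr=1e-4):
        """Fine-tune a pretrained segmentation network on a few labelled scans.

        `pretrained_model` stands in for the source-domain network; only a handful
        of labelled target-domain volumes are assumed to be available.
        """
        optimiser = torch.optim.Adam(pretrained_model.parameters(), lr=lr)
        pretrained_model.train()
        for _ in range(epochs):
            for vol, mask in zip(labelled_volumes, labelled_masks):
                optimiser.zero_grad()
                pred = torch.sigmoid(pretrained_model(vol))
                # Soft Dice loss: penalises poor overlap with the manual mask.
                inter = (pred * mask).sum()
                dice = (2 * inter + 1e-6) / (pred.sum() + mask.sum() + 1e-6)
                loss = 1 - dice
                loss.backward()
                optimiser.step()
        return pretrained_model
    ```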

    On the other hand, as a deep learning tool, it is difficult or nearly impossible to guarantee optimal performance on any unseen data. This is also a motivation for us to design BEN as an extensible tool. As stated in the manuscript, the source domain for BEN is flexible and is not bound to Mouse-T2-11.7T. Instead, users can provide their own data and pretrained network as a new source domain, thereby facilitating domain generalization by reducing the domain gap between the new source and target domains.
