Evaluating Traditional, Deep Learning, and Subfield Methods for Automatically Segmenting the Hippocampus from MRI

Abstract

Given the relationship between hippocampal atrophy and cognitive impairment in various pathological conditions, hippocampus segmentation from MRI is an important task in neuroimaging. Manual segmentation, though considered the gold standard, is time-consuming and error-prone, leading to the development of numerous automatic segmentation methods. However, no study has yet independently compared the performance of traditional, deep learning-based, and hippocampal subfield segmentation methods within a single investigation. We evaluated nine automatic hippocampal segmentation methods (FreeSurfer, FastSurfer, FIRST, e2dhipseg, HippMapper, Hippodeep, FreeSurfer-Subfields, HippUnfold and HSF) across three datasets with manually segmented hippocampus labels. Performance metrics included overlap with manual labels, correlations between manual and automatic volumes, diagnostic group differentiation, and systematically located false positives and negatives. Most methods, especially deep learning-based ones, performed well on public datasets but showed more error and variability on unseen data. Many methods tended to over-segment, particularly at the anterior hippocampus border, but were able to distinguish between healthy controls, patients with mild cognitive impairment (MCI), and dementia patients based on hippocampal volume. Our findings highlight the challenges in hippocampal segmentation from MRI and the need for more publicly accessible datasets with manual labels across diverse ages and pathological conditions.
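To make the voxel-level metrics concrete, the following is a minimal Python sketch of how overlap with a manual label, false-positive and false-negative voxels, and segmented volume might be computed for one pair of binary masks. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not name the overlap measure, so the Dice coefficient is assumed here as a common choice, and the function names, the toy 4x4x4 masks, and the 1 mm isotropic voxel size are all hypothetical.

```python
import numpy as np


def dice_overlap(manual, auto):
    """Dice coefficient between two binary masks (assumed overlap metric)."""
    manual = np.asarray(manual, dtype=bool)
    auto = np.asarray(auto, dtype=bool)
    total = manual.sum() + auto.sum()
    if total == 0:
        # Both masks empty: treat as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(manual, auto).sum() / total


def error_maps(manual, auto):
    """Return boolean maps of false-positive and false-negative voxels."""
    manual = np.asarray(manual, dtype=bool)
    auto = np.asarray(auto, dtype=bool)
    false_pos = auto & ~manual   # labelled by the method, not by the rater
    false_neg = manual & ~auto   # labelled by the rater, missed by the method
    return false_pos, false_neg


def volume_mm3(mask, voxel_size_mm=(1.0, 1.0, 1.0)):
    """Volume of a binary mask; the voxel size is an assumed parameter."""
    return float(np.asarray(mask, dtype=bool).sum() * np.prod(voxel_size_mm))


# Toy example: the "automatic" mask over-segments by one extra slice.
manual = np.zeros((4, 4, 4), dtype=bool)
manual[1:3, 1:3, 1:3] = True          # 8 voxels
auto = np.zeros((4, 4, 4), dtype=bool)
auto[1:3, 1:3, 1:4] = True            # 12 voxels

dice = dice_overlap(manual, auto)     # 2 * 8 / (8 + 12) = 0.8
fp, fn = error_maps(manual, auto)     # 4 false positives, 0 false negatives
print(dice, int(fp.sum()), int(fn.sum()), volume_mm3(auto))
```

Summing such false-positive and false-negative maps across subjects in a common space is one way systematic error locations, such as over-segmentation at a particular border, could be visualized. Per-subject volumes from `volume_mm3` could likewise be correlated between manual and automatic labels.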

Key Messages

  • We evaluated nine automatic hippocampal segmentation methods, including traditional and deep learning-based approaches, across three datasets with manually segmented hippocampus labels.

  • While deep learning-based methods perform well on public datasets, they show more error and variability on unseen data that is more reflective of a clinical population.

  • More publicly accessible datasets with manual labels are required for automatic hippocampal segmentations to be accurate and reliable, particularly for clinical populations.

Practitioner Points

  • Although deep learning-based automatic hippocampal segmentation methods offer faster processing times, a requirement for translation to clinical practice, the limited variability within training sets (for example, in sample demographics and scanner sequences) currently prevents them from generalizing to novel data, such as scans acquired clinically.

  • More training data with varying demographics, scanner sequences and pathologies are required to adequately train deep learning methods to quickly, accurately and reliably segment the hippocampus for use in clinical practice.
