Independent Evaluation of Deep Learning Models for Detecting Focal Cortical Dysplasia

Abstract

Objective

The objective of this study is to independently assess the diagnostic utility of three state-of-the-art tools for detecting focal cortical dysplasia (FCD) on magnetic resonance imaging (MRI): DeepFCD, the Multi-center Epilepsy Lesion Detection (MELD) Classifier, and MELDGraph.

Methods

T1-weighted and fluid-attenuated inversion recovery MR images from 101 epilepsy patients with FCD and 101 age- and sex-matched epilepsy patients without FCD were included. Classifiers were evaluated at the patient level by their ability to correctly identify the presence of any FCD lesion, and at the lesion level by their capacity to identify lesions within the region delineated by the neuroradiologist in the MRI report. A calibrated threshold for DeepFCD prediction probabilities was determined empirically to improve classifier specificity. Test-retest consistency of the classifiers was measured using the Dice coefficient on repeated MRI scans of 21 individuals.
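For readers unfamiliar with the Dice coefficient used for the test-retest comparison, the following is a minimal sketch of how it can be computed for two binary prediction masks. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the abstract does not state how the study handled that case or how the masks were prepared.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary masks of identical shape.

    Dice = 2 * |A ∩ B| / (|A| + |B|), ranging from 0 (no overlap)
    to 1 (identical masks).
    """
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("masks must have the same shape")

    total = int(a.sum()) + int(b.sum())
    if total == 0:
        # Assumption: two empty predictions count as perfect agreement.
        return 1.0

    overlap = int(np.logical_and(a, b).sum())
    return 2.0 * overlap / total


# Toy example: the scan and the rescan agree on one of the predicted voxels.
scan = [1, 1, 0, 0]
rescan = [1, 0, 0, 0]
print(dice_coefficient(scan, rescan))
```

In the toy example the masks contain two and one positive voxels and share one, so the coefficient is 2 × 1 / (2 + 1), or about 0.67.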

Results

At the patient level, false positive rates were high: the MELDClassifier achieved 52% accuracy (sensitivity=91%, specificity=14%), MELDGraph achieved accuracy of up to 61% (sensitivity=76%, specificity=47%), and DeepFCD reached 56% accuracy (sensitivity=62%, specificity=50%) at an empirically determined threshold of 0.90. At the lesion level, the MELDClassifier had a sensitivity of 91% and a positive predictive value (PPV) of 13%, MELDGraph had a sensitivity of 69% and a PPV of 36%, and DeepFCD had a sensitivity of 100% and a PPV of 4%. Test-retest reliability was low, with an average [min, max] Dice coefficient of 0.28 [0.0, 1.0] for MELDClassifier, 0.38 [0.0, 1.0] for MELDGraph with harmonization, and 0.35 [0.05, 0.54] for DeepFCD.
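Because the cohort is balanced (101 patients with FCD and 101 without), patient-level accuracy is the mean of sensitivity and specificity. The sketch below checks the reported figures against that relationship. The helper name `accuracy_from_rates` is hypothetical, and the inputs are the rounded percentages from the abstract, so the results can differ from the reported accuracies by up to about half a percentage point.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Overall accuracy implied by sensitivity and specificity.

    accuracy = (TP + TN) / (n_pos + n_neg), where
    TP = sensitivity * n_pos and TN = specificity * n_neg.
    """
    true_pos = sensitivity * n_pos
    true_neg = specificity * n_neg
    return (true_pos + true_neg) / (n_pos + n_neg)


# Rounded patient-level (sensitivity, specificity) as reported in the abstract.
reported = {
    "MELDClassifier": (0.91, 0.14),
    "MELDGraph": (0.76, 0.47),
    "DeepFCD": (0.62, 0.50),
}

# Group sizes from the Methods: 101 patients with FCD, 101 without.
implied = {
    name: accuracy_from_rates(sens, spec, n_pos=101, n_neg=101)
    for name, (sens, spec) in reported.items()
}

for name, acc in implied.items():
    print(f"{name}: implied accuracy of about {acc:.3f}")
```

The implied values are about 0.525, 0.615, and 0.560, consistent within rounding with the reported accuracies of 52%, 61%, and 56%.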

Significance

This study highlights the current limitations of using deep learning models in FCD diagnosis and emphasizes the need to enhance the tools’ accuracy, reliability, and interpretability to improve their clinical utility in epilepsy diagnosis.

Key points

  • State-of-the-art deep learning tools for identifying focal cortical dysplasia perform with high lesion-level sensitivity, ranging from 69% to 100%.

  • When predicting the presence of any lesion within epilepsy patients’ MRI scans, the classifiers performed with accuracies ranging from 52% to 61%.

  • The average false positive count per patient ranged from 0.49 ± 0.66 (MELDGraph) to 32.71 ± 14.35 (DeepFCD).

  • All three classifiers had low test-retest consistency, suggesting that the predictions may be strongly influenced by noise in the images.
