Automated Quantification of Decreased FAF in Stargardt Disease: Validation of a Novel Method Compared to Manual Grading Standards
Abstract
Purpose
To evaluate the repeatability and reproducibility of a novel automated method compared with manual segmentation for measuring decreased autofluorescence (DAF) and definitely decreased autofluorescence (DDAF) in fundus autofluorescence (FAF) images of patients with Stargardt disease.
Design
Cross-sectional reproducibility and agreement study.
Participants
A total of 316 eyes from 158 patients with genetically confirmed Stargardt disease were analyzed. For intra-grader repeatability, 114 FAF images were reassessed in a masked, repeated-measures design.
Methods
DAF and DDAF lesion areas were independently quantified by five certified graders using either manual delineation with Heidelberg RegionFinder or a threshold-based automated algorithm. Agreement and repeatability were assessed using intraclass correlation coefficients (ICC), standard error of measurement (SEM), minimal detectable change (MDC), Lin’s concordance correlation coefficient (CCC), Bland–Altman plots, and Passing–Bablok regression. Both raw and square-root-transformed lesion areas were evaluated.
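The agreement statistics named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's analysis code: the abstract does not say which formulas or software were used, so the conventional definitions are assumed here (SEM = SD·√(1 − ICC), MDC at the 95% level, Lin's CCC with 1/n moments, Bland–Altman limits at ±1.96 SD). All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def sqrt_transform(areas_mm2):
    """Square-root transform of lesion areas (mm^2 -> mm)."""
    return [math.sqrt(a) for a in areas_mm2]


def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change, assuming the 95% definition 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient using 1/n (population) moments."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) of agreement for paired measurements."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread
```

For example, `lins_ccc(manual, automated)` would take two equal-length lists of lesion areas for the same images, and `lins_ccc(sqrt_transform(manual), sqrt_transform(automated))` would give the transformed counterpart; `manual` and `automated` are placeholder names.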
Main Outcome Measures
Repeatability (intra-grader ICC, SEM, MDC), reproducibility (inter-grader ICC), and agreement (CCC, bias in Passing–Bablok regression) between and within manual and automated methods.
Results
The automated method achieved excellent intra-grader repeatability for both DAF and DDAF (ICCs ≥0.988, SEM ≤0.71 mm², MDC ≤1.98 mm²), with minimal operator influence. Manual measurements showed variable repeatability (DAF ICCs 0.909–0.974; DDAF ICCs as low as 0.837), with square-root transformation reducing SEM and MDC. Inter-grader reproducibility was highest for the automated method (ICC = 0.989–0.992), whereas ICCs for the manual method ranged from 0.764 to 0.939 (raw) and from 0.867 to 0.922 (square-root-transformed). Cross-method agreement was strong (CCC = 0.91–0.96), though minor proportional and constant bias was observed in raw DAF data.
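As a rough consistency check on the automated-method figures, and assuming the conventional 95% definition of MDC (the abstract does not state which formula was used), the reported SEM bound implies:

```latex
\mathrm{MDC}_{95} = 1.96 \cdot \sqrt{2} \cdot \mathrm{SEM}
                  \approx 2.77 \times 0.71\ \mathrm{mm}^2
                  \approx 1.97\ \mathrm{mm}^2
```

This is in line with the reported MDC ≤1.98 mm²; the small difference would be compatible with rounding of the SEM.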
Conclusions
The automated approach provides near-perfect repeatability and high agreement with manual grading, offering a scalable, objective alternative for quantifying hypo-autofluorescent lesions in Stargardt disease. Manual methods are generally reliable but more variable, especially for DDAF, and benefit from square-root transformation.