PyGlaucoMetrics: A Weight Stacking-Based Machine Learning Approach for Glaucoma Detection Using Visual Field Data
Abstract
Background and Objectives: Glaucoma (GL) classification is crucial for early diagnosis and treatment, yet relying solely on stand-alone models or International Classification of Diseases (ICD) codes is insufficient because of their limited predictive power and inconsistencies in clinical labeling. This study aims to improve GL classification by stacking model weights with machine learning-based meta-learners. Materials and Methods: We analyzed a subset of 33,636 participants (58% female) with 340,444 visual fields (VFs) from the Mass Eye and Ear (MEE) dataset. Five clinically relevant GL detection models (LoGTS, UKGTS, Kang, HAP2_part1, and Foster) were selected as base models. Two Multi-Layer Perceptron (MLP) models were trained on 52 total deviation (TD) and pattern deviation (PD) values from Humphrey Field Analyzer (HFA) 24-2 VF tests, along with four clinical variables (age, gender, follow-up time, and race), to extract model weights. These weights were then used to train three meta-learners, Logistic Regression (LR), Extreme Gradient Boosting (XGB), and MLP, to classify cases as GL or non-GL. Results: The MLP meta-learner achieved the highest performance, with an accuracy of 96.43%, an F-score of 96.01%, and an AUC of 97.96%, while also showing the lowest prediction uncertainty (0.08 ± 0.13). XGB followed with 92.86% accuracy, a 92.31% F-score, and a 96.10% AUC. LR had the lowest performance, with 89.29% accuracy, an 86.96% F-score, and a 94.81% AUC, as well as the highest uncertainty (0.58 ± 0.07). Permutation importance analysis revealed that the superior temporal sector was the most influential VF feature, with importance scores of 0.08 in the Kang model and 0.04 in the HAP2_part1 model. Among clinical variables, age was the strongest contributor (score = 0.3). Conclusion: The meta-learning approach outperformed stand-alone models in GL classification, offering a valuable tool for automated glaucoma assessment.
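The general idea of stacking, where a meta-learner is trained on the outputs of several base models, can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the five base models are stood in for by noisy binary voters with made-up reliabilities (the `RELIABILITY` values are assumptions), and only a logistic-regression meta-learner is shown, written with the Python standard library. The sketch also computes a simple permutation importance, defined here as the drop in accuracy after shuffling one input column.

```python
import math
import random

random.seed(0)

# Names taken from the abstract; the reliabilities are hypothetical and
# only control how often each synthetic "base model" agrees with the label.
BASE_MODELS = ["LoGTS", "UKGTS", "Kang", "HAP2_part1", "Foster"]
RELIABILITY = [0.85, 0.80, 0.75, 0.70, 0.65]


def make_dataset(n):
    """Return n synthetic rows of (base-model outputs, true label)."""
    rows = []
    for _ in range(n):
        y = random.randint(0, 1)
        x = [float(y if random.random() < r else 1 - y) for r in RELIABILITY]
        rows.append((x, y))
    return rows


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(rows, epochs=200, lr=0.5):
    """Fit a logistic-regression meta-learner by batch gradient descent."""
    n_features = len(rows[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in rows:
            err = predict_proba(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def accuracy(weights, bias, rows):
    hits = sum((predict_proba(weights, bias, x) >= 0.5) == bool(y) for x, y in rows)
    return hits / len(rows)


def permutation_importance(weights, bias, rows, column):
    """Accuracy drop after shuffling one input column across rows."""
    baseline = accuracy(weights, bias, rows)
    shuffled = [x[column] for x, _ in rows]
    random.shuffle(shuffled)
    permuted = [(x[:column] + [v] + x[column + 1:], y)
                for (x, y), v in zip(rows, shuffled)]
    return baseline - accuracy(weights, bias, permuted)


train, test = make_dataset(800), make_dataset(400)
weights, bias = fit_logistic(train)
meta_acc = accuracy(weights, bias, test)
base_acc = [sum(x[j] == y for x, y in test) / len(test)
            for j in range(len(BASE_MODELS))]
importances = [permutation_importance(weights, bias, test, j)
               for j in range(len(BASE_MODELS))]

for name, acc, imp in zip(BASE_MODELS, base_acc, importances):
    print(f"{name:12s} stand-alone acc={acc:.3f}  permutation importance={imp:+.3f}")
print(f"meta-learner acc={meta_acc:.3f}")
```

In practice, a library implementation such as a gradient-boosted or neural meta-learner would replace the hand-written logistic regression, and the inputs would be the quantities extracted from the trained MLPs rather than synthetic votes.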