A Comparison of Machine Learning Models for Mucopolysaccharidosis Early Diagnosis: A Retrospective Cohort Study using UAE SEHA Electronic Medical Records

Abstract

Rare diseases such as mucopolysaccharidosis (MPS) present unique challenges to healthcare systems, chief among them delayed diagnosis and the lack of accurate diagnosis. Early diagnosis of MPS is crucial because it can significantly improve patients' response to treatment, thereby reducing the risk of complications or death. This study compares the performance of different machine learning (ML) models for MPS diagnosis using electronic health records (EHR) from the Abu Dhabi Health Services Company (SEHA). Our retrospective cohort comprises 115 registered patients aged 19 years or younger, with records spanning 2004 to 2022. Using nested cross-validation, we trained several feature extraction algorithms in combination with several ML algorithms and evaluated each combination with multiple metrics. The best-performing models were then interpreted using Shapley additive explanations (SHAP). Naive Bayes trained on the domain-expert features achieved the highest performance: accuracy 0.93 (0.08), AUC 0.96 (0.04), F1-score 0.91 (0.10), and MCC 0.86 (0.16). The top-ranked features were acute pharyngitis, accretions on teeth, and pediatric body mass index greater than or equal to the 95th percentile for age. This study offers a cost-effective screening method for MPS patients using non-invasive EHR data.
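
For readers unfamiliar with the evaluation setup, the following is a minimal sketch of how nested cross-validation partitions a cohort and how three of the reported metrics (accuracy, F1-score, MCC) follow from a confusion matrix. It is not the authors' pipeline: the fold counts (5 outer, 3 inner), the function names, and the toy labels are all assumptions made for illustration, and the abstract does not state the configuration actually used. A real pipeline would typically also shuffle and stratify the folds.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, tn, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is 0."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def k_fold_indices(indices, k):
    """Split indices into k folds and yield (train, held_out) for each fold."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, held_out


def nested_splits(n_samples, outer_k=5, inner_k=3):
    """Yield (outer_train, outer_test, inner_splits).

    The inner splits are drawn only from outer_train, so model selection
    never sees the outer test fold. Fold counts are hypothetical defaults.
    """
    for outer_train, outer_test in k_fold_indices(list(range(n_samples)), outer_k):
        inner_splits = list(k_fold_indices(outer_train, inner_k))
        yield outer_train, outer_test, inner_splits


# Toy labels (invented for illustration, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
counts = confusion_counts(y_true, y_pred)  # (3, 3, 1, 1)
print(accuracy(*counts), f1_score(*counts), mcc(*counts))  # 0.75 0.75 0.5
```

In this layout the inner folds would be used to choose a feature extractor and classifier, and the chosen combination is then scored once on the untouched outer fold. Per-fold scores can afterwards be summarised across the outer folds.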
