Comparing Machine and Deep Learning Models for Pediatric Anxiety Classification using Structured EHRs and Area-based Measures of Health Data

Abstract

Objective

This study compares the performance of machine learning (ML) and deep learning (DL) models for classifying pediatric patients at risk of anxiety disorders using electronic health records (EHRs). By combining structured EHR data with area-based measures of health (ABMH) data, the approach aims to enable proactive care by monitoring potential anxiety onset across age groups.

Methods

We trained a series of ML and DL models to classify youth at risk of developing anxiety disorders. ML models (Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors, XGBoost) and DL models (LSTM, GRU, RETAIN, Dipole) were trained using structured EHR data from the 30-day periods before anxiety diagnoses. Two datasets were used per age group: one containing structured EHR data only and another combining structured EHR and ABMH data. Model performance was assessed using accuracy, area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), positive predictive value (PPV), negative predictive value (NPV), and F1 score.
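The evaluation metrics named above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUPRC is approximated by average precision under the assumption that scores are not tied.

```python
from typing import Dict, Sequence


def safe_div(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auprc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean precision at the rank of each positive (assumes no tied scores)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("AUPRC needs at least one positive label")
    return total / hits


def evaluate(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Compute the six metrics; the 0.5 threshold is an illustrative assumption."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": safe_div(tp + tn, len(y_true)),
        "auroc": auroc(y_true, scores),
        "auprc": auprc(y_true, scores),
        "ppv": safe_div(tp, tp + fp),   # precision
        "npv": safe_div(tn, tn + fn),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


# Toy example (made-up labels and risk scores, not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
for name, value in evaluate(toy_labels, toy_scores).items():
    print(f"{name}: {value:.3f}")
```

AUROC and AUPRC are computed from the continuous scores, while accuracy, PPV, NPV, and F1 depend on the chosen threshold, so threshold-dependent metrics are only comparable between models when the thresholds are chosen consistently.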

Results

The ML models provided a solid baseline, with XGBoost performing strongly across age groups and reaching AUROC scores of 0.817 (structured EHR) and 0.816 (structured EHR + ABMH). Among the DL models, RETAIN and Dipole performed best. For example, RETAIN achieved AUROC scores of 0.851 (structured EHR) and 0.853 (structured EHR + ABMH), while Dipole scored 0.853 and 0.857, respectively, for 8-year-olds. These results underscore the viability of both ML and DL models for the early detection of pediatric anxiety disorders.

Conclusion

This study comprehensively investigated ML and DL models for classifying pediatric patients at risk of anxiety disorders. We demonstrated that ML and DL models can effectively monitor probable anxiety onset within an EHR system, both with structured EHR data alone and with ABMH data added. Model performance varied with age, indicating the need to develop models tailored to each age group for effective clinical predictive analytics.
