The Sleep-Wake Classification Performance of Pediatric-Trained Machine Learning Algorithms for Raw Accelerometer Data
Discuss this preprint
Start a discussion What are Sciety discussions?Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
Introduction
Actigraphy sleep-wake classification methods increasingly seek to leverage raw acceleration data and machine-learning-based classification, but performance evaluation in pediatrics is limited. We trained machine-learning models using pediatric data and compared their sleep-wake classification performance with existing algorithms for children.
Methods
Sixty-five children (46% female, ages 5.3–17.7 years) completed in-lab overnight polysomnography and wore a GENEActiv device on their non-dominant wrist. The acceleration data were converted into 30-second epochs and aligned with physician-scored sleep-wake data from electroencephalography. Seven machine-learning models were trained using leave-one-subject-out cross-validation. Epoch-by-epoch analyses generated performance metrics (e.g., balanced accuracy [BA]) and discrepancy analyses provided overall sleep duration bias estimates. The combination of highest performance and least bias was used to rank using Euclidean distance scores - where a lower score represents closer to perfect performance and zero bias. For benchmarking, we included GGIR sleep scoring algorithms and an adult trained random forest classifier.
Results
Overall, 560.1 hours of polysomnography and actigraphy data were collected (74.4% of epochs were scored as sleep). The pediatric-trained local-global long-short term memory (LSTM) classifier had the most optimal epoch-by-epoch performance (e.g., BA=0.85, sensitivity=0.88, specificity=0.83, ROC-AUC=0.95, and Cohen’s kappa=0.67). These metrics exceeded that of an adult-trained random forest classifier and GGIR-based algorithms. Discrepancy analyses revealed that overall sleep duration was underestimated by an average of 25 minutes using the LSTM classifier with no proportional bias.
Conclusion
We trained seven pediatric sleep-wake classifiers that had strong ability to detect sleep and wake, with the LSTM classifier being most optimal.