Data-Driven Early Prediction of Cerebral Palsy Using AutoML and interpretable kinematic features

Abstract

Early identification of cerebral palsy (CP) remains a major challenge because it relies on expert assessments that are time-intensive and not scalable. Consequently, a range of studies have aimed to use machine learning to predict CP-related clinical scores from motion tracking, e.g. from video data. These studies generally predict clinical scores that are a proxy for CP risk. However, clinicians ultimately want to estimate a patient’s risk of developing clinical symptoms, not the scores themselves. Here we present a data-driven machine-learning (ML) pipeline that extracts movement features from video-based motion tracking of infants and estimates CP risk using AutoML. Using AutoSklearn, our framework minimizes the risk of overfitting by abstracting away researcher-driven hyperparameter optimization. Trained on movement data from 3- to 4-month-old infants, our classifier predicts a highly indicative clinical score (General Movements Assessment [GMA]) with an ROC-AUC of 0.78 on a held-out test set, indicating that kinematic movement features capture clinically relevant variability. Without retraining, the same model predicts the risk of cerebral palsy outcomes at later clinical follow-ups with an ROC-AUC of 0.74, demonstrating that early motor representations generalize to long-term neurodevelopmental risk. We employ pre-registered lock-box validation to ensure rigorous performance evaluation. This study highlights the potential of AutoML-powered movement analytics for neurodevelopmental screening, demonstrating that data-driven feature extraction from movement trajectories can provide an interpretable and scalable approach to early risk assessment.
By integrating pre-trained vision transformers, AutoML-driven model selection, and rigorous validation protocols, this work advances the use of video-derived movement features for scalable, data-driven clinical assessment, demonstrating how computational methods based on readily available data like infant videos can enhance early risk detection in neurodevelopmental disorders.
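
As a rough illustration of the evaluation ideas named above (a lock-box held out before any model development, scored once with ROC-AUC), the following is a minimal sketch in plain Python. It is not the authors' pipeline: the function names `lockbox_split` and `roc_auc`, the 20% lock-box fraction, the random seed, and the toy labels and scores are all assumptions made for illustration, and the abstract does not state how the actual split was made.

```python
import random


def lockbox_split(ids, lockbox_fraction=0.2, seed=0):
    """Split subject ids into (development, lockbox) sets.

    Hypothetical helper: the fraction and seed are illustrative assumptions.
    The lock-box ids should be set aside before any model development and
    used only once, for the final evaluation.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_lock = max(1, int(round(len(shuffled) * lockbox_fraction)))
    return shuffled[n_lock:], shuffled[:n_lock]


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example with made-up labels and classifier scores (not study data).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.6, 0.7, 0.8, 0.2, 0.4]
toy_auc = roc_auc(toy_labels, toy_scores)

# Ten hypothetical infant ids, split once into development and lock-box sets.
dev_ids, lock_ids = lockbox_split(range(10))
```

The pairwise definition makes the metric easy to read: an ROC-AUC of 0.78 means that, for a randomly chosen positive and negative case, the classifier ranks the positive case higher about 78% of the time.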

CCS Concepts

  • Computing methodologies → Machine learning approaches ;

  • Applied computing → Health informatics .

ACM Reference Format

Melanie Segado, Laura Prosser, Andrea F. Duncan, Michelle J. Johnson, and Konrad P. Kording. Data-Driven Early Prediction of Cerebral Palsy Using AutoML and interpretable kinematic features. In. ACM, New York, NY, USA, 8 pages.
