Development and evaluation of wrist- and thigh-worn accelerometer algorithms using self-training machine learning models for classification of activity type and posture: towards device placement-agnostic methods in the ProPASS consortium
Abstract
Background
Wearable accelerometers are widely used in health research, but differing placements (e.g. wrist vs. thigh) hinder harmonisation of activity classification across studies. Prior studies report 1.5- to 2.0-fold differences in physical activity level by wear location, which hampers data comparability and compromises the potential for pooling data in consortia and for carrying out meta-analyses and individual participant data analyses. Although supervised machine learning is increasingly used in wearables research, its reliance on extensive labelled data limits its use in free-living datasets. Semi-supervised learning offers an efficient alternative by using laboratory-collected labelled data to iteratively self-train models on unlabelled free-living data. Using a self-training approach, this study aimed to train and evaluate algorithms for wrist- and thigh-worn devices to facilitate harmonisation of posture and activity type classification between placements.
Methods
A total of 146 participants aged 30-75 years completed either structured laboratory-based activity trials or one of two independent free-living assessments while wearing Axivity AX3 accelerometers on the wrist and thigh. For each placement, a supervised Random Forest classifier was initially trained on a labelled laboratory dataset (n=40) to classify sitting, standing, walking, running, stair climbing, and cycling, and was then re-trained using self-training on free-living data (n=53, independent of the laboratory study sample). The final models were validated on a second, independent hold-out free-living dataset (n=53) with ground-truth activity labels obtained via direct video observation. Overall model comparison and performance were assessed using accuracy, the kappa statistic, and F1 scores. Individual activity class comparison and performance were evaluated using equivalence testing, confusion matrices, and the coefficient of variation between the wrist and thigh estimates.
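The self-training step described above can be illustrated with a minimal sketch. It is not the study's implementation: a simple nearest-centroid classifier stands in for the Random Forest so that the example needs only the Python standard library, and the feature vectors, the confidence threshold of 0.8, and all function names are assumptions made for illustration. The loop itself follows the general pattern: fit on labelled data, pseudo-label the unlabelled samples the model is confident about, add them to the training set, and refit.

```python
import math


def fit_centroids(X, y):
    """Return {label: centroid}, the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_with_confidence(centroids, features):
    """Return (label, confidence) for one sample.

    Confidence is a softmax over negative Euclidean distances; this is an
    illustrative choice, not a quantity defined in the abstract.
    """
    weights = {
        label: math.exp(-math.dist(features, centroid))
        for label, centroid in centroids.items()
    }
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Iteratively pseudo-label confident unlabelled samples and refit.

    Returns the final centroids and the samples that never reached the
    (hypothetical) confidence threshold.
    """
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    centroids = fit_centroids(X_train, y_train)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for features in pool:
            label, confidence = predict_with_confidence(centroids, features)
            if confidence >= threshold:
                # Treat the confident prediction as a pseudo-label.
                X_train.append(features)
                y_train.append(label)
                added += 1
            else:
                remaining.append(features)
        if added == 0:
            break  # nothing new to learn from; stop early
        pool = remaining
        centroids = fit_centroids(X_train, y_train)
    return centroids, pool
```

In practice the labelled set would correspond to the laboratory data and the unlabelled pool to the free-living data, with a separate labelled hold-out set reserved for validation.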
Results
Across a total of 43,800 minutes of data, of which 19,080 minutes were in the hold-out dataset, both self-trained models achieved high overall classification accuracy: 91.8% (SD = 6.8%) for the wrist and 95.1% (SD = 5.4%) for the thigh. The overall F1 score was 88.2% (SD = 9.6%) for the wrist classifier and 90.1% (SD = 9.3%) for the thigh classifier. Equivalence testing demonstrated that both classifiers produced activity duration estimates statistically equivalent to ground-truth for all activity types except stair climbing. Confusion matrices for the wrist demonstrated very good to excellent (88%–97%) classification accuracy for sitting, walking, running, and cycling, and good accuracy for standing and stair climbing (71%–78%). For the thigh, classification performance was very good to excellent (83%–98%) across sitting, standing, walking, running, and cycling, with good accuracy for stair climbing (75%). The coefficient of variation values ranged from 0.022 for running to 0.140 for standing.
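The summary statistics reported above can be derived from a confusion matrix, as the following sketch shows. The abstract does not state how the overall F1 score was averaged or how the coefficient of variation was computed, so two assumptions are made here: F1 is macro-averaged across classes, and the coefficient of variation is the sample standard deviation of the paired wrist and thigh estimates divided by their mean. The function names and the numbers in the usage note are hypothetical.

```python
import statistics


def summarise(confusion):
    """Return (accuracy, Cohen's kappa, macro F1) from a square confusion matrix.

    confusion[i][j] is the number of epochs whose true class is i and whose
    predicted class is j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_totals = [sum(confusion[i]) for i in range(k)]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    accuracy = sum(confusion[i][i] for i in range(k)) / total

    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total**2
    kappa = (accuracy - expected) / (1 - expected)

    # Per-class F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (row total + column total).
    f1_scores = []
    for i in range(k):
        denominator = row_totals[i] + col_totals[i]
        f1_scores.append(2 * confusion[i][i] / denominator if denominator else 0.0)
    macro_f1 = sum(f1_scores) / k  # assumption: unweighted (macro) average

    return accuracy, kappa, macro_f1


def coefficient_of_variation(estimates):
    """Sample SD divided by the mean (one common definition; an assumption here)."""
    return statistics.stdev(estimates) / statistics.mean(estimates)
```

For example, `summarise([[40, 10], [5, 45]])` gives an accuracy of 0.85 and a kappa of 0.70 for a hypothetical two-class problem, and `coefficient_of_variation([100, 110])` compares a made-up pair of wrist and thigh duration estimates in minutes.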
Conclusion
These findings highlight the potential of self-training models to support harmonisation of wearable accelerometer data collected using different wear placements in ProPASS and other consortia. Self-training models reduce reliance on extensive labelled data and demonstrated high activity type classification accuracy for both wrist- and thigh-worn accelerometers, with a high degree of agreement and equivalence with ground-truth data across almost all activity types.