INTACT: A method for integration of longitudinal physical activity data from multiple sources
Abstract
Wearable devices and digital phenotyping are increasingly used in observational and interventional studies to measure real-time biosignals such as physical activity. However, integrating and comparing data across studies and cohorts remains challenging due to variability in device types, acquisition protocols, and preprocessing methods. A key challenge is removing unwanted study- or device-specific effects while preserving meaningful biological signals. These difficulties are exacerbated by the longitudinal and within-day correlations inherent in high-resolution time-varying data collected from wearable sensors. To address these challenges, we propose INTACT (INtegration of Time-varying data from weArable sensors for physiCal acTivity), a novel method for harmonizing time-varying physical activity intensity data from accelerometers. INTACT models shared information through common eigenvalues and eigenfunctions while allowing for source-specific scale and rotation adjustments. We demonstrate the proposed method in two real-world applications: (i) integration of accelerometer data from two waves of the National Health and Nutrition Examination Survey (NHANES), measured using different devices and reported in different units; and (ii) integration of NHANES accelerometry data with accelerometer and gyroscope measures from commercial devices. Across both applications, INTACT outperforms existing approaches in mitigating source effects while preserving biological variation, enabling more reliable cross-study comparisons of physical activity patterns.
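To make the idea of a shared eigenstructure with source-specific scale and rotation more concrete, the sketch below aligns one source to another by matching eigenvalues and eigenvectors of their sample covariances. It is a generic covariance-alignment illustration, not the INTACT estimator: the function name `align_to_reference`, the simulated 24-point daily profiles, the choice of three components, and the sign-matching heuristic are all assumptions made for this example.

```python
import numpy as np

def align_to_reference(X_src, X_ref, n_components=3):
    """Hypothetical helper: map X_src (n x T) onto the scale and
    orientation of X_ref (m x T) using the leading eigenpairs of each
    sample covariance. Illustrative only; not the INTACT algorithm."""
    mu_src, mu_ref = X_src.mean(axis=0), X_ref.mean(axis=0)

    def top_k(X):
        # Eigen-decompose the T x T sample covariance, keep the k largest.
        vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
        order = np.argsort(vals)[::-1][:n_components]
        return vals[order], vecs[:, order]

    l_src, V_src = top_k(X_src)
    l_ref, V_ref = top_k(X_ref)

    # Eigenvectors are defined up to sign; flip source vectors so each
    # points the same way as its reference counterpart (a heuristic).
    signs = np.sign(np.sum(V_src * V_ref, axis=0))
    signs[signs == 0] = 1.0
    V_src = V_src * signs

    # Project onto source components, rescale to the reference
    # eigenvalues, then re-express in the reference eigenvectors.
    scores = (X_src - mu_src) @ V_src
    scores = scores * np.sqrt(l_ref / l_src)
    return scores @ V_ref.T + mu_ref

# Simulated example (assumed data): two "devices" observe the same
# latent daily pattern, but the second reports in different units and
# with a rotated basis.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 24, endpoint=False)
basis = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t), np.sin(4 * np.pi * t)])
theta = 0.3
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0,            0.0,           1.0]])
sd = np.array([3.0, 2.0, 1.0])

X_ref = 5.0 + (rng.normal(size=(300, 3)) * sd) @ basis + 0.1 * rng.normal(size=(300, 24))
X_src = 100.0 + 40.0 * (rng.normal(size=(200, 3)) * sd) @ (rot @ basis) + 4.0 * rng.normal(size=(200, 24))

aligned = align_to_reference(X_src, X_ref, n_components=3)
```

After alignment, the source curves share the reference mean and the leading eigenvalues of the reference covariance. A two-step alignment like this ignores covariates and longitudinal correlation, both of which the abstract identifies as central to the harmonization problem.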