Did you miss me? Making the most of digital phenotyping data by imputing missingness with point process models
Abstract
Objectives
Digital phenotyping has broad clinical potential, providing low-burden, objective measures of behaviour as individuals go about their lives. Its goals include predicting relapse in mental illness and improving symptom monitoring. However, progress in making clinical inferences from these data is severely hampered by the frequent occurrence of missing data. We propose a novel method to address this: imputing missing digital phenotyping data with non-homogeneous Poisson point process models (PPPMs), which treat smartphone-based activities as ‘points’ and so account for the time-series nature of the data.
Methods
We demonstrate the use of PPPMs for imputing time series data and evaluate their influence on downstream analysis. We evaluate the inclusion of time-varying covariates to model diurnal variations in personalised PPPMs. We validate this model in a ground truth evaluation using participants from SMARD (n=26), then in the PRISM (n=65) and Hersenonderzoek (n=283) studies, where we perform a replication analysis involving hidden Markov models (HMMs).
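For orientation, one common way to write a non-homogeneous Poisson process with a one-hot ‘hour of the day’ covariate is sketched below. The log link, the coefficient symbols \beta_h, and the notation \mathcal{O} for the observed (non-missing) time periods are assumptions made for illustration; the abstract does not state the exact parameterisation used.

```latex
% Intensity: one coefficient per hour of the day (one-hot covariate, log link assumed)
\lambda(t) = \exp\!\Big( \sum_{h=0}^{23} \beta_h \, \mathbb{1}\big[\mathrm{hour}(t) = h\big] \Big)

% Log-likelihood of events t_1, \dots, t_n recorded during the observed periods \mathcal{O}
\log L(\beta) = \sum_{i=1}^{n} \log \lambda(t_i) \;-\; \int_{\mathcal{O}} \lambda(t)\, \mathrm{d}t
```

Under this form, comparing models by out-of-sample likelihood amounts to evaluating \log L on held-out events with the fitted coefficients.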
Results
In the ground truth evaluation, PPPMs that included one-hot encoded ‘hour of the day’ as a covariate provided the best fit (highest out-of-sample likelihood). Using this imputation method, HMM properties such as daily rhythms were preserved and we successfully replicated findings from our prior work.
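The idea of imputing with an hour-of-day PPPM can be illustrated with a minimal sketch. It assumes a piecewise-constant intensity (one rate per hour of the day, which is what a one-hot hour covariate yields), event times measured in hours since the start of recording, and missingness flagged per one-hour bin. The function names `fit_hourly_rates` and `impute_missing` and the toy data are hypothetical and are not taken from the preprint.

```python
import numpy as np

def fit_hourly_rates(event_times, observed):
    """Maximum-likelihood rate (events per hour) for each hour of the day.

    event_times: event times in hours since the start of recording.
    observed: boolean array with one entry per one-hour bin (True = observed).
    """
    event_times = np.asarray(event_times, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    hour_index = np.floor(event_times).astype(int)
    in_observed = observed[hour_index]  # ignore events in bins flagged missing
    counts = np.bincount(hour_index[in_observed] % 24, minlength=24)
    exposure = np.bincount(np.flatnonzero(observed) % 24, minlength=24)
    # Rate = events / observed hours; hours of day never observed get rate 0.
    return np.divide(counts, exposure, out=np.zeros(24), where=exposure > 0)

def impute_missing(rates, observed, rng):
    """Simulate event times inside the missing one-hour bins."""
    observed = np.asarray(observed, dtype=bool)
    missing_hours = np.flatnonzero(~observed)
    # With a constant rate inside a bin, the count is Poisson and the event
    # positions are uniform within that bin.
    n_events = rng.poisson(rates[missing_hours % 24])
    starts = np.repeat(missing_hours, n_events)
    return np.sort(starts + rng.uniform(0.0, 1.0, size=starts.size))

# Toy example: two days of recording with hours 30-35 missing.
observed = np.ones(48, dtype=bool)
observed[30:36] = False
events = [9.2, 9.7, 20.1, 44.3]

rates = fit_hourly_rates(events, observed)
imputed = impute_missing(rates, observed, np.random.default_rng(0))
```

Because the rates are estimated per individual, the simulated events follow that person's own diurnal pattern, which is the sense in which such a model is personalised.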
Discussion
Personalised PPPMs using covariates provide tailored simulations of behaviour that can be used for imputation in behavioural time series.
Conclusion
PPPMs using covariates are a promising imputation tool that may contribute to improved utility of digital phenotyping.