Did you miss me? Making the most of digital phenotyping data by imputing missingness with point process models

Imogen E. Leaning
Andrea Costanzo
Raj Jagesar
Loran Knol
Sarah Tjeerdsma
Anna Tyborowska
Nessa Ikani
Lianne M Reus
Pieter Jelle Visser
Martien J.H. Kas
Christian F. Beckmann
Henricus G. Ruhé
Andre F. Marquand


Abstract

Objectives

Digital phenotyping has broad clinical potential, providing low-burden, objective measures of behaviour as individuals go about their lives. Its goals include predicting relapse in mental illness and improving symptom monitoring. However, progress in making clinical inferences from these data is severely challenged by the common occurrence of missing data. We propose a novel method to address this: using non-homogeneous Poisson point process models (PPPMs) to impute missing digital phenotyping data. This approach accounts for the time-series nature of the data by modelling smartphone-based activities as ‘points’.

Methods

We demonstrate the use of PPPMs for imputing time-series data and evaluate their influence on downstream analysis. We evaluate the inclusion of time-varying covariates to model diurnal variations in personalised PPPMs. We validate this model first in a ground-truth evaluation using participants from SMARD (n=26), then in the PRISM (n=65) and Hersenonderzoek (n=283) studies, where we perform a replication analysis involving hidden Markov models (HMMs).
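
For orientation, a non-homogeneous Poisson point process with time-varying covariates is commonly written in the form below. This is the standard textbook formulation, not necessarily the exact parameterisation used in the article: the log-linear link, the covariate vector $x(t)$, the coefficients $\beta$, and the observed window $\mathcal{O}$ are all notation assumed here for illustration.

```latex
% Assumed log-linear intensity driven by time-varying covariates x(t)
\lambda(t) = \exp\!\big(\beta^{\top} x(t)\big)

% Log-likelihood of events t_1, \dots, t_n recorded during the observed window \mathcal{O}
\log L(\beta) = \sum_{i=1}^{n} \log \lambda(t_i) \;-\; \int_{\mathcal{O}} \lambda(t)\,\mathrm{d}t

% Number of events in a missing interval [a, b] under the fitted model
N(a, b) \sim \mathrm{Poisson}\!\left(\int_{a}^{b} \lambda(t)\,\mathrm{d}t\right)
```

Under this formulation, imputation amounts to simulating events in each missing interval from the fitted intensity.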

Results

In the ground truth evaluation, PPPMs including ‘hour of the day’ as a covariate, encoded using one-hot encoding, provided the best fit (highest out-of-sample likelihood). Using this imputation method, HMM properties such as daily rhythms were preserved and we successfully replicated findings from our prior work.
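
As a rough illustration of what a one-hot ‘hour of the day’ covariate implies, the sketch below assumes a log-linear intensity, in which case the maximum-likelihood rate for each hour of the day is simply the event count divided by the observed exposure in that hour. The function names (`fit_hourly_rates`, `impute_events`), the hour-level granularity of missingness, and the input layout are hypothetical choices made for this example and are not taken from the article.

```python
import numpy as np

def fit_hourly_rates(event_hours, observed_hours):
    """Estimate one event rate (events per hour) for each hour of the day.

    event_hours: event times, in hours since the start of recording.
    observed_hours: integer hour indices (since start) that were fully observed.
    """
    counts = np.zeros(24)
    exposure = np.zeros(24)
    observed = set(observed_hours)
    for h in observed:
        exposure[h % 24] += 1.0          # one hour of exposure per observed hour
    for t in event_hours:
        if int(t) in observed:
            counts[int(t) % 24] += 1.0   # count events falling in observed hours
    # Hours of the day with no exposure get rate 0 (a simplifying assumption).
    return np.divide(counts, exposure, out=np.zeros(24), where=exposure > 0)

def impute_events(rates, missing_hours, rng):
    """Simulate event times inside each missing hour from the fitted rates."""
    times = []
    for h in missing_hours:
        n = rng.poisson(rates[h % 24])                   # Poisson count for this hour
        times.extend(h + rng.uniform(0.0, 1.0, size=n))  # uniform positions within it
    return np.sort(np.array(times))

# Toy data: three observed hours, then three missing hours two days later.
rates = fit_hourly_rates([0.5, 0.7, 1.5, 24.2], observed_hours=[0, 1, 24])
imputed = impute_events(rates, missing_hours=[48, 49, 50], rng=np.random.default_rng(0))
```

Because the rate is constant within each hour, drawing a Poisson count and placing the events uniformly within that hour is an exact simulation of the piecewise-constant process. A finer time grid or additional covariates would need a more general fitting routine.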

Discussion

Personalised PPPMs using covariates provide tailored simulations of behaviour that can be used for imputation in behavioural time series.

Conclusion

PPPMs using covariates are a promising imputation tool that may contribute to improved utility of digital phenotyping.

Version published to 10.1101/2025.05.13.25327521 on medRxiv
May 13, 2025

Related articles

Missing Data in OHCA Registries: How Multiple Imputation Methods Affect Research Conclusions—Paper II

This article has 4 authors:
1. Stella Jinran Zhan
2. Seyed Ehsan Saffari
3. Marcus Eng Hock Ong
4. Fahad Javaid Siddiqui
This article has no evaluations
Latest version: Jan 16, 2026

Evaluating Imputation Methods for Handling Missing Data in Complex Survey Designs: Evidence from the India DHS 2017–18

This article has 6 authors:
1. Mahfuzer Rohman
2. Md Sabbir Hossain
3. Md Fakrul Islam
4. Prosenjit Basak Arka
5. Md Rafi Hasan
6. Md Jamal Uddin
This article has no evaluations
Latest version: Jan 23, 2026

Missing Data in Intensive Longitudinal Suicide Research: A Monte Carlo simulation study

This article has 1 author:
1. Aleksandr Karnick
This article has no evaluations
Latest version: Feb 4, 2026
