Probing missing data in population-based longitudinal studies: A tutorial and application using R

Hedyeh Ahmadi
Gagandeep Singh
Megan Herting

Abstract

A common challenge in longitudinal population-based research is the amount of incomplete and missing data that arises when participants fail to complete the protocol or are lost to follow-up over time. Such missingness can bias parameter estimates, reduce statistical power, and ultimately distort the interpretation of findings. Yet few studies report key information about missingness in their sample. Moreover, while ample information exists on the types of missingness and potential methods for handling it, the field lacks detail on how to conduct a missingness analysis in a real-world setting. In this tutorial, we illustrate key steps in handling missing data and give researchers an opportunity to practice identifying the magnitude and patterns of missing data, as well as assessing both selection and retention bias. We use a large, publicly available, longitudinal pediatric population-based study, with air pollution as the primary exposure and a type of emotional health behavior, known as internalization, as the outcome. We provide reproducible R code that researchers can easily adapt to their own longitudinal observational studies.

Version published to 10.31234/osf.io/ajzyh_v1 on OSF Preprints
Sep 27, 2025

Related articles

Multiple Imputation of Missing Data in Longitudinal Designs: A Comparison of Different Strategies

This article has 4 authors:
1. Mark Lustig
2. Oliver Lüdtke
3. Alexander Robitzsch
4. Simon Grund
This article has no evaluations. Latest version: Sep 22, 2025

Accounting for Structured Missingness in Canonical Correlation Analysis

This article has 3 authors:
1. Lav Radosavljević
2. Stephen M. Smith
3. Thomas E. Nichols
This article has no evaluations. Latest version: Oct 10, 2025

Quantifying and adjusting for selection biases in the Norwegian Mother, Father and Child Cohort Study using population-wide individual-level registry information

This article has 7 authors:
1. Christopher Rayner
2. Laurie John Hannigan
3. Isabella Badini
4. Perline Demange
5. Sverre Berg Ofstad
6. Eivind Ystrom
7. Tom McAdams
This article has no evaluations. Latest version: Nov 13, 2025
