Testing Whether Reported Treatment Effects Are Unduly Influenced by Item-Level Heterogeneity
Abstract
This paper addresses the situation in which treatment effects are reported using educational or psychological outcome measures composed of multiple questions or “items.” Drawing on item response theory, we distinguish among three estimands of potential interest: (a) a treatment effect on the latent variable representing the construct of interest, which we refer to as impact; (b) test-level treatment effects computed from aggregates of assessment items (e.g., the unweighted mean); and (c) item-specific effects. We show that test-level treatment effects and impact are generally not equivalent estimands in the presence of item-level treatment effect heterogeneity. Consequently, failing to distinguish these estimands can have important implications for the validity of research studies. To address this issue, we propose a diagnostic test to infer whether estimated treatment effects based on the unweighted mean of assessment items are a suitable proxy for impact on the latent trait. We illustrate the use of the test with a case study, and we provide initial evidence about the prevalence of the issue using a small meta-analysis. Results from the meta-analysis indicated that treatment effects based on the unweighted mean over assessment items often overestimated impact on the latent trait, and that this pattern was more pronounced for researcher-developed assessments than for independently developed assessments.
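To make the core phenomenon concrete, the following is a minimal simulation sketch, not the authors' diagnostic test. It assumes a two-parameter logistic (2PL) IRT model; the parameter values, the pattern of item-level effects in `delta`, and the function names are all illustrative assumptions. It shows how item-specific treatment effects can inflate the standardized effect on the unweighted mean score relative to the true impact on the latent trait.

```python
import numpy as np

rng = np.random.default_rng(1)

N, J = 20_000, 20                  # persons per arm, items
IMPACT = 0.30                      # assumed treatment effect on the latent trait (SD units)
a = rng.uniform(0.8, 2.0, J)       # 2PL item discriminations
b = rng.normal(0.0, 1.0, J)        # 2PL item difficulties

def mean_scores(treated: bool, delta: np.ndarray) -> np.ndarray:
    """Unweighted mean item score per person under a 2PL model, with
    item-specific treatment effects delta_j added on the logit scale."""
    theta = rng.normal(IMPACT if treated else 0.0, 1.0, N)
    eta = a * (theta[:, None] - b) + (delta if treated else 0.0)
    p = 1.0 / (1.0 + np.exp(-eta))
    return (rng.random((N, J)) < p).mean(axis=1)

def smd(delta: np.ndarray) -> float:
    """Standardized mean difference computed on the unweighted mean score."""
    y1, y0 = mean_scores(True, delta), mean_scores(False, delta)
    pooled_sd = np.sqrt((y1.var(ddof=1) + y0.var(ddof=1)) / 2.0)
    return (y1.mean() - y0.mean()) / pooled_sd

homogeneous = np.zeros(J)          # treatment works only through the latent trait
heterogeneous = np.zeros(J)
heterogeneous[:5] = 0.6            # e.g., a few items closely aligned with taught content

print(f"assumed impact on latent trait:     {IMPACT:.2f}")
print(f"test-level SMD, no item effects:    {smd(homogeneous):.2f}")
print(f"test-level SMD, item-level effects: {smd(heterogeneous):.2f}")
```

Under these assumptions, the heterogeneous condition yields a larger test-level standardized mean difference than the homogeneous one, even though the impact on the latent trait is identical, which is consistent with the overestimation pattern the abstract describes for researcher-developed assessments.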