Clarifying the reliability paradox: poor test-retest reliability attenuates group differences

Povilas Karvelis
Andreea Oliviana Diaconescu

Abstract

The cognitive sciences are grappling with the reliability paradox: measures that robustly produce within-group effects tend to have low test-retest reliability, rendering them unsuitable for studying individual differences. Despite the growing awareness of this paradox, its full extent remains underappreciated. Specifically, most research focuses exclusively on how reliability affects correlational analyses of individual differences, while largely ignoring its effects on studying group differences. Moreover, some studies explicitly and erroneously suggest that poor reliability does not pose problems for studying group differences, possibly due to conflating within- and between-group effects. In this brief report, we aim to clarify this misunderstanding. Using both data simulations and mathematical derivations, we show how observed group differences are attenuated by poor measurement reliability. We consider multiple scenarios, including when groups are created by thresholding a continuous measure (e.g., patients vs. controls or median split), when groups are defined exogenously (e.g., treatment vs. control groups, or male vs. female), and how the observed effect sizes are further affected by differences in measurement reliability and between-subject variance between the groups. We provide a set of equations for calculating attenuation effects across these scenarios. This work has important implications for biomarker research and clinical translation, as well as any other area of research that relies on group comparisons to inform policy and real-world applications.
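The attenuation described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or their set of equations, which are not reproduced on this page. It assumes the simplest scenario only: exogenously defined groups, equal reliability and equal between-subject variance in both groups, and classical test theory (observed score = true score + independent error). Under those assumptions the standard attenuation relation is d_observed = d_true * sqrt(reliability). The function names and parameter values are hypothetical and chosen for illustration.

```python
import math
import random


def mean_and_var(xs):
    """Return the sample mean and unbiased sample variance of xs."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def cohens_d(a, b):
    """Cohen's d for two equal-sized groups, using the pooled SD."""
    mean_a, var_a = mean_and_var(a)
    mean_b, var_b = mean_and_var(b)
    return (mean_a - mean_b) / math.sqrt((var_a + var_b) / 2.0)


def simulate_observed_d(true_d, reliability, n_per_group=100_000, seed=0):
    """Simulate two exogenous groups and return the observed Cohen's d.

    Assumptions (illustrative, not taken from the article):
    - true scores have unit between-subject SD within each group;
    - measurement error is independent Gaussian noise;
    - reliability = true variance / (true variance + error variance),
      with 0 < reliability <= 1, identical in both groups.
    """
    rng = random.Random(seed)
    # With true variance fixed at 1, this error SD yields the requested reliability.
    error_sd = math.sqrt((1.0 - reliability) / reliability)
    group_a = [rng.gauss(true_d, 1.0) + rng.gauss(0.0, error_sd)
               for _ in range(n_per_group)]
    group_b = [rng.gauss(0.0, 1.0) + rng.gauss(0.0, error_sd)
               for _ in range(n_per_group)]
    return cohens_d(group_a, group_b)


if __name__ == "__main__":
    true_d = 0.8
    for reliability in (1.0, 0.8, 0.5, 0.3):
        simulated = simulate_observed_d(true_d, reliability)
        predicted = true_d * math.sqrt(reliability)
        print(f"reliability={reliability:.1f}  "
              f"simulated d={simulated:.3f}  predicted d={predicted:.3f}")
```

For example, with a true effect of d = 0.8 and a reliability of 0.5, the relation predicts an observed effect of about 0.8 * 0.707 = 0.57. The thresholded-group and unequal-reliability scenarios mentioned in the abstract need different expressions and are not covered by this sketch.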

Version published to 10.31234/osf.io/z4yqe_v4 on OSF Preprints
Mar 30, 2025
Version published to 10.31234/osf.io/z4yqe_v3 on OSF Preprints
Feb 21, 2025
Version published to 10.31234/osf.io/z4yqe_v2 on OSF Preprints
Feb 8, 2025
Version published to 10.31234/osf.io/z4yqe_v1 on OSF Preprints
May 4, 2024

Related articles

Unstable Measures, Unreliable Effects: Re-evaluating Replicability with Reliability Informed Confidence Intervals

This article has 1 author:
1. Carl Weems
This article has no evaluations. Latest version: Mar 31, 2025

Erroneous Generalization - Exploring Random Error Variance in Reliability Generalizations of Psychological Measurements

This article has 3 authors:
1. Lukas Joscha Beinhauer
2. Jens Fuenderich
3. Frank Renkewitz
This article has no evaluations. Latest version: Mar 26, 2025

Reliability and statistical power: Conceptual background and practical implications

This article has 1 author:
1. Attila Krajcsi
This article has no evaluations. Latest version: May 5, 2025
