On the Unreliability of Test-Retest Reliability

Abstract

The Test-Retest Coefficient (TRC) is a central metric of reliability in Classical Test Theory and in modern psychological assessment. Originally developed by early 20th-century psychometricians, it relies on two assumptions: fixed (i.e., perfectly stable) true scores and independent error scores. These assumptions are rarely, if ever, tested, even though violating them can introduce significant bias. This article explores the basis of these assumptions and examines the performance of the TRC under varying conditions, including different sample sizes, degrees of true score stability, and degrees of error score dependence. Results from simulated data show that decreasing true score stability biases TRC estimates downward, so that reliability is underestimated. Conversely, error score dependence can inflate TRC values, making unreliable measures appear reliable. This study also derives new formulas, based on empirical results, that describe the biased estimation of the TRC. These findings demonstrate that although the TRC may perform well under ideal conditions, even slight deviations from its assumptions can lead to significant inaccuracies. The TRC may therefore be unsuitable for practical applications, particularly when test conditions are uncontrolled or traits vary over time.
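The two biases described in the abstract can be illustrated with a minimal simulation sketch. It is not the article's own simulation design or code: the function names, parameter names, and default values below are hypothetical, and the sketch assumes normally distributed scores, equal variances on both occasions, and true scores that are independent of error scores. Under those assumptions the expected test-retest correlation is the standard Classical Test Theory expression (true score stability × reliability) + (error dependence × (1 − reliability)), which is not necessarily one of the formulas derived in the article.

```python
import numpy as np


def correlated_pair(rng, n, rho):
    """Draw n pairs of standard-normal values with correlation rho."""
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    return z1, rho * z1 + np.sqrt(1.0 - rho**2) * z2


def simulate_trc(n, reliability=0.8, true_stability=1.0,
                 error_dependence=0.0, seed=0):
    """Estimate the TRC from simulated test and retest scores.

    Hypothetical parameters (assumptions of this sketch):
      reliability      -- share of observed variance due to true scores
      true_stability   -- correlation of true scores across occasions
      error_dependence -- correlation of error scores across occasions
    """
    rng = np.random.default_rng(seed)
    t1, t2 = correlated_pair(rng, n, true_stability)
    e1, e2 = correlated_pair(rng, n, error_dependence)
    # Scale so that observed scores have unit variance on each occasion.
    sd_true = np.sqrt(reliability)
    sd_error = np.sqrt(1.0 - reliability)
    x1 = sd_true * t1 + sd_error * e1
    x2 = sd_true * t2 + sd_error * e2
    return float(np.corrcoef(x1, x2)[0, 1])


def expected_trc(reliability, true_stability, error_dependence):
    """Expected TRC under the equal-variance assumptions stated above."""
    return (true_stability * reliability
            + error_dependence * (1.0 - reliability))


if __name__ == "__main__":
    scenarios = [
        ("ideal conditions", 0.8, 1.0, 0.0),
        ("unstable true scores", 0.8, 0.7, 0.0),
        ("dependent errors", 0.4, 1.0, 0.5),
    ]
    for label, rel, stab, dep in scenarios:
        estimate = simulate_trc(100_000, rel, stab, dep)
        print(f"{label}: reliability={rel}, estimated TRC={estimate:.3f}, "
              f"expected={expected_trc(rel, stab, dep):.3f}")
```

In this sketch, the second scenario yields a TRC well below the reliability it is meant to estimate, while the third yields a TRC well above it, which mirrors the underestimation and inflation the abstract describes.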
