Comparative judgement without the fancy statistics
Abstract
Comparative judgement methods for assessment are increasingly popular. They involve assessors making comparisons about the ‘quality’ of pairs of students’ work, and the comparisons are statistically modelled to produce scores. Recently, Benton and Gallacher (2018, p.25) claimed that “much of the apparent advantage of [comparative judgement] can be explained by its use of fancy statistics”. They evidenced this by applying ‘fancy statistics’ to raw scores from multiple marked essays, and comparing the predictive value of the raw scores with the fancy statistics outcomes. Here I take the inverse approach and compare raw scores from comparative judgement assessments with fancy statistics outcomes. I reanalysed studies from peer-reviewed outlets in which the main measure was based on comparative judgement. I report that raw scores reduced the reliability and validity of outcomes relative to fancy statistics in about one fifth of cases. I consider the implications of the findings for using comparative judgement in educational research.
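For readers unfamiliar with the two approaches being compared: the ‘fancy statistics’ in comparative judgement typically means fitting a Bradley–Terry-style model to the pairwise comparisons, while the ‘raw score’ alternative is simply each piece of work’s proportion of comparisons won. The sketch below is a minimal illustration of both (not the analysis used in this article or in Benton and Gallacher’s study), using a simple iterative fit of Bradley–Terry strengths; all function names and the toy data are my own.

```python
from collections import defaultdict

def bradley_terry(comparisons, iters=100):
    """Fit Bradley-Terry strengths with a simple iterative (MM) update.

    comparisons: list of (winner, loser) pairs of item labels.
    Returns a dict mapping item -> strength, normalised to sum to 1.
    """
    items = set()
    wins = defaultdict(int)   # total wins per item
    n = defaultdict(int)      # number of comparisons per unordered pair
    for w, l in comparisons:
        items.update((w, l))
        wins[w] += 1
        n[frozenset((w, l))] += 1

    p = {i: 1.0 for i in items}
    for _ in range(iters):
        new_p = {}
        for i in items:
            # MM update: p_i <- W_i / sum_j n_ij / (p_i + p_j)
            denom = sum(
                n[frozenset((i, j))] / (p[i] + p[j])
                for j in items
                if j != i and n[frozenset((i, j))] > 0
            )
            new_p[i] = wins[i] / denom if denom else p[i]
        total = sum(new_p.values())
        p = {i: v / total for i, v in new_p.items()}
    return p

def raw_scores(comparisons):
    """'Raw score' alternative: each item's proportion of comparisons won."""
    wins, total = defaultdict(int), defaultdict(int)
    for w, l in comparisons:
        wins[w] += 1
        total[w] += 1
        total[l] += 1
    return {i: wins[i] / total[i] for i in total}
```

On a toy set of judgements such as `[("A","B"), ("A","B"), ("A","C"), ("B","C")]`, both approaches rank A above B above C; the question the article examines is how often, on real data, the modelled scores and the raw win proportions lead to different reliability and validity conclusions.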