Psychometrics is all you need

Jodi M. Casabianca

Abstract

This paper argues that the current state of AI development and evaluation should be re-imagined within a psychometric validity framework long used by the measurement and assessment community. It frames the problem of evaluating AI using the same principled structure applied to human trait evaluation. The paper presents an evidence-centered, psychometric framework that guides generative AI model applications from construct-driven design through evaluation and validation. While the literature has suggested application of psychometric methods and tests to certain components of AI evaluation, a broader, overarching approach has not been comprehensively connected to AI development and evaluation, nor have these ideas gained widespread adoption. I explicitly connect evidence-centered design models to the generative AI application pipeline to optimize alignment between intended claims, model outputs, and their evaluation. To inform evaluation plans so they support arguments for the use of AI system outputs, I propose collecting validity evidence and use a case study to demonstrate this paradigm. This paper also provides guiding questions for developers and evaluators and highlights how psychometric methods might be used for AI.

Version published to 10.35542/osf.io/7w6pz_v1 on OSF Preprints
Nov 9, 2025
