Why and when you should avoid using z-scores in graphs displaying profile or group differences

Julia Moeller


Abstract

Many person-oriented studies use z-standardized scores before conducting cluster analyses and/or before displaying group differences. This article summarizes reasons why z-standardized scores can often be problematic and misleading in person-oriented methods. The article shows examples illustrating why and how the use of z-scores in group classification and comparisons can be misleading, and proposes less problematic methods. Reasons why z-standardized scores should be avoided when classifying or displaying differences between clusters, profiles, and other groups are:

(1) The ratio of the difference between two groups is distorted in z-scores.
(2) The ratio of the difference between two variables is distorted in z-scores.
(3) Information about item endorsement and item rejection is lost.
(4) The psychological meaning of a given z-score does not compare across samples and variables.
(5) Group assignments can be misleading if z-scores are used to assign individuals to groups.
(6) The group size and group frequency may be affected if z-scores instead of raw scores are used to assign individuals to groups.
(7) Group differences in further outcome variables can change if z-scores instead of raw scores are used to assign individuals to groups.
(8) Alternative normalization techniques tend to perform better than z-standardization in cluster analyses.
(9) Z-standardization relies on homogeneity assumptions, including unimodality, but distributions analysed in person-oriented research are often multimodal.
(10) Person-oriented methods typically examine within-person patterns to answer research questions about within-person phenomena, whereas z-standardization typically refers to between-person variation, which creates a logical mismatch between theory and method.

Alternatives to using z-scores in graphs displaying profiles and group differences are using raw scores or using scale transformations that use the range, not the standard deviation, in the normalization.
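As a rough illustration of reason (3) and of the range-based alternative mentioned in the abstract, the following minimal Python sketch contrasts z-standardization with a rescaling that uses the theoretical range of the response scale. The ratings, the 1-to-5 response scale, the function names, and the choice of the population standard deviation are all assumptions made for this example; they are not taken from the article.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values using the sample's own mean and (population) SD."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]


def range_scores(values, scale_min, scale_max):
    """Rescale values to 0..1 using the theoretical range of the response scale."""
    return [(v - scale_min) / (scale_max - scale_min) for v in values]


# Hypothetical ratings on a 1-5 agreement scale: every respondent endorses the item.
ratings = [4, 4, 5, 5, 5]

z = z_scores(ratings)
r = range_scores(ratings, scale_min=1, scale_max=5)

for raw, z_val, r_val in zip(ratings, z, r):
    print(f"raw={raw}  z={z_val:+.2f}  range-scaled={r_val:.2f}")
```

In this made-up sample, a rating of 4 ("agree") receives a negative z-score because it lies below the sample mean, so a graph of z-scores could suggest rejection where respondents in fact endorsed the item. The range-based score keeps the rating's position relative to the endpoints of the scale.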

Version published to 10.31234/osf.io/3pf5k_v2 on OSF Preprints
Jun 2, 2025
Version published to 10.31234/osf.io/3pf5k on OSF Preprints
Apr 24, 2020

Related articles

The Uniformity Fallacy: A Second Common, Severe Misinterpretation of Bar Graphs of Averages

This article has 2 authors:
1. Jeremy Bennet Wilmer
2. Sarah Horan Kerns
This article has no evaluations. Latest version: May 29, 2025

A Tutorial on Estimating the Precision of Individual Test Scores

This article has 4 authors:
1. Julius M. Pfadt
2. Dylan Molenaar
3. Petra Hurks
4. Klaas Sijtsma
This article has no evaluations. Latest version: Jul 7, 2025

Under my Umbrella: Rating Scales Obscure Statistical Power and Effect Size Heterogeneity

This article has 3 authors:
1. Jens Fuenderich
2. Lukas Joscha Beinhauer
3. Frank Renkewitz
This article has no evaluations. Latest version: May 21, 2025
