Detecting item misfit in Rasch models

Abstract

Psychometrics in general has long relied on rule-of-thumb critical values for various goodness-of-fit metrics. With more powerful personal computers it is both feasible and desirable to use simulation and/or bootstrap methods to determine appropriate cutoff values. This paper illustrates and evaluates the use of an R package for Rasch psychometrics that has implemented functions to simplify the process of determining simulation-based cutoff values. Through a series of simulation studies, a comparison is made between two methods: information-weighted conditional item fit ("infit") and item-restscore correlations using Goodman and Kruskal's 𝛾. Results indicate the limitations of small samples (n < 500) in correctly detecting item misfit due to multidimensionality, especially when a larger proportion of items are misfit and the misfit items are off-target. Item outfit shows very low performance in general. Conditional infit with simulation-based cutoffs performs better than item-restscore at sample sizes below 500. Both methods result in problematic rates of false positives with large samples (n >= 1000). Large datasets should be analyzed using a nonparametric bootstrap of subsamples with item-restscore to reduce the risk of type-1 errors. Finally, the importance of an iterative analysis process is emphasized, since a situation where several items show underfit will cause other items to show overfit to the Rasch model. Underfit items should be removed one at a time, and a re-analysis conducted at each step, to avoid erroneously eliminating items.
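
The general recipe for simulation-based cutoffs can be sketched as follows. This is a minimal illustration using base R and the eRm package, not the package evaluated in the paper; the quantile choices and number of replications are assumptions for the sketch, and the conditional infit studied in the paper differs in detail from eRm's infit MSQ.

```r
# Sketch: data-driven cutoffs for infit MSQ by re-simulating data under the
# fitted Rasch model and taking empirical quantiles of the simulated statistics.
library(eRm)

# Simulate a dichotomous response matrix from a Rasch model:
# P(X = 1) = plogis(theta - delta)
simulate_rasch <- function(theta, delta) {
  p <- plogis(outer(theta, delta, "-"))
  matrix(rbinom(length(p), 1, p), nrow = length(theta))
}

set.seed(1)
obs <- simulate_rasch(rnorm(300), seq(-2, 2, length.out = 10))  # stand-in "observed" data

fit   <- RM(obs)                      # conditional ML estimation of item parameters
delta <- -fit$betapar                 # eRm reports easiness parameters; negate for difficulty
ppar  <- person.parameter(fit)
theta <- coef(ppar)                   # person parameter estimates
obs_infit <- itemfit(ppar)$i.infitMSQ # observed infit mean-squares per item

# Re-simulate under the fitted model and collect infit values per item.
# (In real use, increase n_sim and guard against degenerate simulated datasets.)
n_sim <- 200
sim_infit <- replicate(n_sim, {
  X <- simulate_rasch(rnorm(nrow(obs), mean(theta, na.rm = TRUE), sd(theta, na.rm = TRUE)),
                      delta)
  itemfit(person.parameter(RM(X)))$i.infitMSQ
})

# Empirical 2.5% / 97.5% quantiles of simulated infit as per-item cutoffs
cutoffs <- apply(sim_infit, 1, quantile, probs = c(0.025, 0.975))
flagged <- obs_infit < cutoffs[1, ] | obs_infit > cutoffs[2, ]
```

The point of the exercise is that the acceptance region adapts to sample size, test length, and targeting, instead of applying one fixed rule-of-thumb range to every dataset.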
