Parametric bootstrapping of standard errors for the ability parameter in IRT models: Asymptotic and small sample analyses
Abstract
Assessing a participant's latent ability is one of the goals of item response theory (IRT) and IRT-based computerized adaptive testing (CAT). In the literature, the most commonly adopted indicator of the precision of the ability estimator in IRT (including CAT) is the Fisher information. However, this index cannot reliably reflect the precision of the estimate when the number of items is small. Two alternatives for evaluating the standard error (SE) of the ability estimator have been proposed: the Monte Carlo parametric bootstrap (Liou & Yu, 1991) and the "exact SE" method (Magis, 2014). The first part of this study is theoretical. We start by showing that Magis's method is an instance of the parametric bootstrap. In particular, we derive the exact bootstrap distribution based on the principle of Magis's algorithm. Next, we investigate the asymptotic properties of the parametric bootstrap SE when the item responses are independent but not necessarily identically distributed. We prove the consistency of the parametric bootstrap SE and establish its asymptotic relative efficiency with respect to the Fisher information SE. The second part presents an improvement of the parametric bootstrap algorithm for the SE that reduces its computational complexity. Using minimal sufficient statistics, we develop an accelerated version of the algorithm for the two-parameter logistic model. The last part concerns the small sample analysis. We conduct a series of simulations comparing the performance of the SE estimator from our proposed algorithm with that of the information-based method. We also illustrate an application of our algorithm to test construction using existing data with a small number of items. Overall, the results reveal an advantage of our newly developed method over the information-based method for assessing the precision of the ability estimator in IRT.
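To make the three SE notions in the abstract concrete, the following is a minimal sketch for the two-parameter logistic model. It is not the authors' accelerated algorithm. It assumes a maximum likelihood ability estimator truncated to [-4, 4] (so that all-correct and all-incorrect patterns yield finite estimates), and the item parameters in `ITEMS` and all function names are hypothetical, chosen only for illustration. The exact version enumerates all 2^n response patterns with probabilities evaluated at the plug-in ability value, so it is practical only for short tests; the Monte Carlo version approximates the same bootstrap distribution by sampling.

```python
import math
import random
from itertools import product

# Hypothetical (discrimination a, difficulty b) pairs for a 5-item test.
ITEMS = [(1.2, -1.0), (0.8, -0.5), (1.0, 0.0), (1.5, 0.5), (0.9, 1.0)]

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_se(theta, items):
    """Information-based SE: 1 / sqrt(sum_i a_i^2 P_i (1 - P_i))."""
    info = 0.0
    for a, b in items:
        p = p_correct(theta, a, b)
        info += a * a * p * (1.0 - p)
    return 1.0 / math.sqrt(info)

def mle_theta(responses, items, lo=-4.0, hi=4.0):
    """MLE of ability, truncated to [lo, hi] (an assumption of this sketch).

    The 2PL score function sum_i a_i (x_i - P_i) is strictly decreasing
    in theta, so bisection on its sign finds the maximizer.
    """
    def score(theta):
        return sum(a * (x - p_correct(theta, a, b))
                   for x, (a, b) in zip(responses, items))
    if score(lo) <= 0.0:
        return lo
    if score(hi) >= 0.0:
        return hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exact_bootstrap_se(theta_hat, items):
    """SD of the estimator over all 2^n patterns, weighted at theta_hat."""
    probs = [p_correct(theta_hat, a, b) for a, b in items]
    first = second = 0.0
    for pattern in product((0, 1), repeat=len(items)):
        weight = 1.0
        for x, p in zip(pattern, probs):
            weight *= p if x else 1.0 - p
        est = mle_theta(pattern, items)
        first += weight * est
        second += weight * est * est
    return math.sqrt(max(second - first * first, 0.0))

def monte_carlo_bootstrap_se(theta_hat, items, n_boot=2000, seed=0):
    """SD of the estimator over n_boot response patterns simulated at theta_hat."""
    rng = random.Random(seed)
    probs = [p_correct(theta_hat, a, b) for a, b in items]
    ests = []
    for _ in range(n_boot):
        pattern = tuple(1 if rng.random() < p else 0 for p in probs)
        ests.append(mle_theta(pattern, items))
    mean = sum(ests) / n_boot
    return math.sqrt(sum((e - mean) ** 2 for e in ests) / n_boot)
```

Calling `fisher_se(0.0, ITEMS)`, `exact_bootstrap_se(0.0, ITEMS)`, and `monte_carlo_bootstrap_se(0.0, ITEMS)` gives the three quantities for a plug-in ability of 0. The truncation bounds affect the bootstrap values noticeably on tests this short, so the numbers should be read as illustrative only.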