Differential Item Functioning analysis in SABER 11. A case study for DIF in large-scale assessments

Abstract

For every test, evidence of its validity should be examined. Absence of Differential Item Functioning (DIF) is an important piece of evidence to support the validity of group comparisons of test results. In this paper, we examine DIF between two test forms of the Mathematics test of SABER 11, a Large Scale Assessment (LSA) in Colombia. We illustrate how to tailor the process for identifying DIF by following the set of guiding questions proposed by Sireci and Rios (2013, Educational Research and Evaluation, 19(2-3), 170–187) and answering each of them for the analyzed test. Additionally, we present the results of three simulation studies conducted to investigate the performance of the non-compensatory DIF (NCDIF) index under large sample sizes and large sample size ratios (up to 1:25), as well as the performance of the effect size measure guidelines (Wright and Oshima, 2015, Educational and Psychological Measurement, 75(2), 338-358) under these conditions. These simulation studies were conducted because a gap in the literature on this DIF index obstructed the decisions required to complete the analyses of the SABER 11 tests. Their results allowed us to make the corresponding choices about the sample sizes to use in the analysis of the real SABER 11 data and about the inclusion of the effect size as part of the detection procedure. The results also shed light on the performance of the NCDIF index more generally, across several conditions that apply not only to SABER 11 but possibly to other LSAs. Lastly, the results suggest that simulation studies examining the performance of NCDIF, and possibly any DIF statistic, should use realistic item parameter pools and not only sanitized, well-distributed sets of item parameters.
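
For readers unfamiliar with the index, NCDIF for an item is commonly defined as the squared difference between the item's response probabilities under the focal-group and reference-group parameter estimates, averaged over the abilities of the focal group. The following is a minimal illustrative sketch only, not the procedure used in the paper. It assumes a two-parameter logistic (2PL) model, item parameters already linked to a common scale, and made-up parameter values; the names `p_2pl` and `ncdif` are hypothetical.

```python
import math
import random


def p_2pl(theta, a, b):
    """2PL item response function (assumed model): P(correct | theta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def ncdif(focal_thetas, ref_params, focal_params):
    """Mean squared gap between focal and reference item curves,
    averaged over the focal group's ability values."""
    a_r, b_r = ref_params
    a_f, b_f = focal_params
    squared_gaps = [
        (p_2pl(t, a_f, b_f) - p_2pl(t, a_r, b_r)) ** 2 for t in focal_thetas
    ]
    return sum(squared_gaps) / len(squared_gaps)


# Illustrative focal-group abilities drawn from a standard normal (assumption).
random.seed(0)
focal_thetas = [random.gauss(0.0, 1.0) for _ in range(5000)]

# Same parameters in both groups: the curves coincide, so NCDIF is zero.
no_dif = ncdif(focal_thetas, (1.2, 0.0), (1.2, 0.0))

# Item is harder for the focal group (difficulty shifted by 0.5): NCDIF > 0.
with_dif = ncdif(focal_thetas, (1.2, 0.0), (1.2, 0.5))

print(no_dif, with_dif)
```

In practice the parameters in each group are estimates, so an item is flagged only when its NCDIF value exceeds a cutoff; how such cutoffs and the accompanying effect size guidelines behave under large and unbalanced samples is the question the simulation studies address.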
