Are Medical School Preclinical Tests Biased for Sex and Race? A Differential Item Functioning Analysis

Esther Dale
Mohammed A. A. Abulela
Hao Jia
Claudio Violato


Abstract

Background: It has recently become common practice in assessment development to verify that test items function equally across subgroups of test-takers. This is fundamental to fairness and, consequently, to the validity of test score interpretations and uses. Accordingly, we conducted a differential item functioning (DIF) analysis by students' sex and race for three foundational preclinical medical school courses. Methods: The sample included 520, 519, and 344 medical students for anatomy, histology, and physiology, respectively, with data collected from 2018 to 2020. We conducted the DIF analysis in the IRTPRO software using the item response theory two-parameter logistic model. Results: In each of the three assessments, up to one-fifth of the items functioned differentially by sex, race, or both: 10 of 49 items (20%) in Anatomy, 6 of 40 items (15%) in Histology, and 5 of 45 items (11%) in Physiology showed statistically significant DIF. Measurement specialists and subject matter experts independently reviewed the items to identify construct-irrelevant factors as potential sources of DIF. Most of the flagged items were poorly written or had unclear images. Conclusions: The validity of score-based inferences, particularly for subgroup comparisons, requires test items to function equally across examinee subgroups. In the present study, we found DIF in some items for sex and race in three content areas. The present approach should be explored in other medical schools to assess the generalizability of these findings. Item-level DIF analysis should also be conducted routinely as part of psychometric analyses for basic sciences courses and other assessments.
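
For readers unfamiliar with the two-parameter logistic model named in the Methods, the following is a minimal sketch of what DIF means under that model. It is not the IRTPRO procedure and does not use the study's data. The function name `p_correct_2pl` and all parameter values are hypothetical and chosen only for illustration. Under the model, the probability of a correct response depends on ability `theta`, item discrimination `a`, and item difficulty `b`. An item shows DIF when examinees of equal ability in different groups have different probabilities of answering it correctly.

```python
import math


def p_correct_2pl(theta, a, b):
    """Two-parameter logistic item response function.

    theta: examinee ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical parameters for one item, as if calibrated separately in two groups.
# Equal discrimination but a higher difficulty in the focal group (uniform DIF).
reference = {"a": 1.2, "b": 0.0}
focal = {"a": 1.2, "b": 0.5}

for theta in (-1.0, 0.0, 1.0):
    p_ref = p_correct_2pl(theta, **reference)
    p_foc = p_correct_2pl(theta, **focal)
    # At the same ability level, a nonzero gap indicates the item favors one group.
    print(f"theta={theta:+.1f}  reference={p_ref:.3f}  focal={p_foc:.3f}  gap={p_ref - p_foc:.3f}")
```

In practice, a DIF analysis tests whether such differences in item parameters between groups are statistically significant rather than inspecting the gaps by eye.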

Version published to 10.21203/rs.3.rs-4087802/v1 on Research Square
Mar 14, 2024

Related articles

Differential Item Functioning analysis in SABER 11. A case study for DIF in large-scale assessments

This article has 3 authors:
1. Victor Hernando Cervantes
2. Alexander Calderon
3. Nelson Rodriguez
This article has no evaluations
Latest version: Mar 26, 2024

The Myers-Briggs Type Indicator Association with United States Medical Student Performance, Demographics, and Career Values

This article has 3 authors:
1. Henry Krasner
2. Leah Yim
3. Edward Simanton
This article has no evaluations
Latest version: Mar 12, 2024

Early Childhood Measurement Invariance of the Strengths and Difficulties Questionnaire Across Age, Race, Sex, and Socioeconomic Status

This article has 4 authors:
1. Alyssa R Palmer
2. Isabella Stallworthy
3. Meriah Lee DeJoseph
4. Daniel Berry
This article has no evaluations
Latest version: Apr 29, 2024
