Abstract
The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure (MACS) model and item response theory (IRT), with those of an observed-score-based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation study was conducted in which polytomous data with five ordered categories were generated using Samejima’s graded response model under three crossed factors: sample size per group (300, 500, and 1,000 examinees), type of DIF (b-parameter, a-parameter, and a- and b-parameter DIF), and magnitude of DIF (small and large). The Type I error rate was inflated for the IRT-based tests and ordinal logistic regression when some of the items contained DIF. For the uniform DIF conditions, MACS and IRT exhibited similar power; however, ordinal logistic regression exhibited slightly higher power than the other two methods at the smaller sample sizes. Lastly, for nonuniform DIF, IRT exhibited much more power than MACS and ordinal logistic regression.
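As an illustration of the data-generation step, the following is a minimal sketch of drawing five-category responses under a graded response model for a reference group and a focal group. It is not the study's actual simulation code: the function name simulate_grm, the item parameter values, the test length of 10 items, the standard normal trait distribution, and the 0.5 threshold shift used to mimic b-parameter DIF are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_grm(theta, a, b, rng):
    """Draw graded responses (0..K-1) under a graded response model.

    theta: (n,) latent traits; a: (J,) discriminations;
    b: (J, K-1) thresholds, increasing within each item.
    """
    # Cumulative probabilities P(X >= k | theta), shape (n, J, K-1)
    z = a[None, :, None] * (theta[:, None, None] - b[None, :, :])
    p_star = 1.0 / (1.0 + np.exp(-z))
    # One uniform draw per person-item pair; because p_star decreases in k,
    # the response equals the number of thresholds whose p_star exceeds it
    u = rng.random((theta.size, a.size, 1))
    return (u < p_star).sum(axis=2)

rng = np.random.default_rng(0)
n_per_group, n_items = 500, 10          # illustrative sizes (assumed)
a = np.full(n_items, 1.5)               # assumed discriminations
b = np.tile(np.array([-1.5, -0.5, 0.5, 1.5]), (n_items, 1))  # 5 categories

# Hypothetical b-parameter (uniform) DIF: shift item 0's thresholds
# for the focal group only
b_focal = b.copy()
b_focal[0] += 0.5

ref = simulate_grm(rng.standard_normal(n_per_group), a, b, rng)
foc = simulate_grm(rng.standard_normal(n_per_group), a, b_focal, rng)
```

Changing a for the focal group instead of b would mimic a-parameter DIF, and changing both would mimic the combined a- and b-parameter condition.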