Abstract
A theoretical advantage of item response theory (IRT) models is that trait estimates based on these models provide more test information than any other type of test score. It remains unclear, however, whether using IRT trait estimates improves external validity results relative to those obtained with simple raw scores. This paper discusses some methodological results based on the 2-parameter logistic model (2PLM) and addresses three issues: first, how validity coefficients based on IRT trait estimates should be interpreted; second, how inferences about these coefficients can be made; and third, what differences in external validity can be expected if the 2PLM is correct for the data and IRT scores are used in place of raw scores. Four empirical examples in the personality domain provide further evidence on the results that can be expected in real research, in which the model is, at best, a good approximation to the data. A general result of these examples is that validity coefficients based on IRT scores are similar to those based on raw scores.
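
The comparison described above can be illustrated with a minimal simulation sketch. It is not the procedure used in the paper. It assumes the 2PLM holds exactly, treats the item parameters as known rather than estimated, uses expected a posteriori (EAP) scoring with a standard normal prior as one possible IRT trait estimate, and generates a hypothetical external criterion that correlates 0.5 with the latent trait. All names, sample sizes, and parameter ranges are illustrative assumptions.

```python
import numpy as np


def simulate_2plm(theta, a, b, rng):
    """Draw binary responses under the 2PLM: P(x=1) = 1 / (1 + exp(-a * (theta - b)))."""
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))  # persons x items
    return (rng.random(p.shape) < p).astype(int)


def eap_scores(x, a, b, n_nodes=61):
    """EAP trait estimates on a fixed grid, assuming a standard normal prior."""
    nodes = np.linspace(-4.0, 4.0, n_nodes)
    prior = np.exp(-0.5 * nodes**2)  # unnormalized N(0, 1) density
    p = 1.0 / (1.0 + np.exp(-a * (nodes[:, None] - b)))  # nodes x items
    # Log-likelihood of each response pattern at each grid node (persons x nodes).
    loglik = x @ np.log(p).T + (1 - x) @ np.log(1.0 - p).T
    # Subtract the row maximum before exponentiating for numerical stability.
    post = np.exp(loglik - loglik.max(axis=1, keepdims=True)) * prior
    post /= post.sum(axis=1, keepdims=True)
    return post @ nodes  # posterior mean for each person


# Illustrative settings (assumptions, not values from the paper).
rng = np.random.default_rng(0)
n_persons, n_items = 2000, 20
a = rng.uniform(0.5, 2.0, n_items)   # discriminations, treated as known
b = rng.uniform(-2.0, 2.0, n_items)  # difficulties, treated as known
theta = rng.standard_normal(n_persons)

# Hypothetical external criterion with a population correlation of 0.5 with theta.
criterion = 0.5 * theta + np.sqrt(1.0 - 0.5**2) * rng.standard_normal(n_persons)

x = simulate_2plm(theta, a, b, rng)
raw = x.sum(axis=1)        # simple raw (sum) scores
eap = eap_scores(x, a, b)  # IRT trait estimates

r_raw = np.corrcoef(raw, criterion)[0, 1]
r_eap = np.corrcoef(eap, criterion)[0, 1]
r_scores = np.corrcoef(raw, eap)[0, 1]
print(f"validity, raw scores: {r_raw:.3f}")
print(f"validity, EAP scores: {r_eap:.3f}")
print(f"correlation between raw and EAP scores: {r_scores:.3f}")
```

Comparing the two printed validity coefficients shows, for this one artificial data set, how much the choice of score affects the correlation with the criterion. A single replication with known item parameters is only a rough illustration and does not replace the formal results or the empirical examples.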