Multidimensional CAT Item Selection Methods for Domain Scores and Composite Scores: Theory and Applications

Author
Lihua Yao
Affiliation
[1] Defense Manpower Data Center, Monterey Bay
Source
Psychometrika | 2012 / Vol. 77
Keywords
BMIRT; CAT; domain scores; Kullback–Leibler; MCAT; multidimensional item response theory; multidimensional information; overall scores
Abstract
Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure of item pools, the population distribution of the simulees, the number of items selected, and the content area. The existing procedures, namely Volume (Segall in Psychometrika 61:331–354, 1996), Kullback–Leibler information (Veldkamp & van der Linden in Psychometrika 67:575–588, 2002), Minimize the error variance of the linear combination (van der Linden in J. Educ. Behav. Stat. 24:398–412, 1999), and Minimum Angle (Reckase in Multidimensional item response theory, Springer, New York, 2009), are compared to a new procedure, Minimize the error variance of the composite score with the optimized weight, proposed for the first time in this study. The intent is to find an item selection procedure that yields higher precision for both the domain and composite abilities and a higher percentage of selected items from the item pool. The comparison is performed by examining the absolute bias, correlation, test reliability, time used, and item usage. Three sets of item pools are used, with the item parameters estimated from live CAT data. Results show that Volume and Minimum Angle performed similarly, balancing information for all content areas, while the other three procedures performed similarly, with high precision for both domain and overall scores when the required number of items is selected for each domain. The new item selection procedure has the highest percentage of item usage. Moreover, for the overall score, it produces similar or even better results compared to those from the method that selects items favoring the general dimension using the general model (Segall in Psychometrika 66:79–97, 2001); the general dimension method has low precision for the domain scores. In addition to the simulation study, the mathematical theories for certain procedures are derived. The theories are confirmed by the simulation applications.
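To make two of the criteria named in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes a compensatory multidimensional two-parameter logistic model, a fixed provisional ability estimate, and an identity matrix standing in for the prior precision so that the accumulated information matrix is invertible. The function names `item_information` and `select_item` and all parameter names are hypothetical. With no weights, the next item maximizes the determinant of the accumulated information matrix (a Volume-style criterion); with a fixed weight vector, it minimizes the approximate error variance of the weighted composite. The optimized-weight procedure proposed in the paper is not reproduced here.

```python
import numpy as np


def item_information(a, d, theta):
    """Fisher information matrix of one item at ability vector theta.

    Assumes a compensatory M2PL item with discrimination vector `a`
    and intercept `d` (an illustrative model choice).
    """
    p = 1.0 / (1.0 + np.exp(-(a @ theta + d)))
    return p * (1.0 - p) * np.outer(a, a)


def select_item(A, D, theta, info, available, weights=None):
    """Return the index of the next item to administer.

    A         : (n_items, n_dims) discrimination matrix
    D         : (n_items,) intercepts
    theta     : provisional ability estimate
    info      : information accumulated so far (assumed invertible,
                e.g. a prior precision plus administered-item information)
    available : indices of items not yet administered
    weights   : None -> maximize det(info + item information)
                array -> minimize w' (info + item information)^{-1} w
    """
    best_item, best_value = None, -np.inf
    for j in available:
        total = info + item_information(A[j], D[j], theta)
        if weights is None:
            value = np.linalg.det(total)
        else:
            # Negated so that a smaller variance gives a larger value.
            value = -(weights @ np.linalg.solve(total, weights))
        if value > best_value:
            best_item, best_value = j, value
    return best_item


# Hypothetical three-item, two-dimensional pool.
A = np.array([[1.8, 0.0], [0.0, 1.5], [0.2, 0.2]])
D = np.zeros(3)
theta_hat = np.zeros(2)
prior_precision = np.eye(2)

next_by_volume = select_item(A, D, theta_hat, prior_precision, [0, 1, 2])
next_for_domain_2 = select_item(
    A, D, theta_hat, prior_precision, [0, 1, 2], weights=np.array([0.0, 1.0])
)
```

In a full adaptive test, the chosen item's information would be added to `info`, the item removed from `available`, and the ability estimate updated before the next selection.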
Pages: 495–523
Number of pages: 28