Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design

Cited by: 130
Authors
Hanson, BA [1]
Béguin, AA [1]
Affiliation
[1] CTB, Monterey, CA 93940 USA
Keywords
equating; item parameter estimation; item response theory; concurrent calibration
DOI
10.1177/0146621602026001001
Chinese Library Classification (CLC) number
O1 [Mathematics]; C [Social Sciences, General]
Subject classification code
03; 0303; 0701; 070101
Abstract
Item response theory item parameters can be estimated using data from a common-item equating design either separately for each form or concurrently across forms. This paper reports the results of a simulation study of separate versus concurrent item parameter estimation. Using simulated data from a test with 60 dichotomous items, four factors were considered: (a) estimation program (MULTILOG versus BILOG-MG), (b) sample size per form (3,000 versus 1,000), (c) number of common items (20 versus 10), and (d) equivalent versus nonequivalent groups taking the two forms (no mean difference versus a mean difference of 1 SD). In addition, four methods of item parameter scaling were used in the separate estimation condition: two item characteristic curve methods (Stocking-Lord and Haebara) and two moment methods (Mean/Mean and Mean/Sigma). Concurrent estimation generally resulted in lower error than separate estimation, although not universally so. The results suggest that one factor accounting for the lower error when using concurrent estimation may be that the parameter estimates for the common item parameters are based on larger samples. It is argued that the results of this study, together with other research on this topic, are not sufficient to recommend completely avoiding separate estimation in favor of concurrent estimation.
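The two moment methods named in the abstract can be illustrated with a minimal sketch. It assumes the usual linear scale transformation b* = A·b + B and a* = a / A, with A and B computed from the common items' discrimination (a) and difficulty (b) estimates from two separate calibrations. The function names, the use of the population standard deviation, and the toy values are assumptions made for illustration; this is not the code used in the study.

```python
from statistics import mean, pstdev


def mean_sigma(b_from, b_to):
    """Mean/Sigma: match the mean and SD of common-item difficulties.

    Returns (A, B) such that b_to is approximately A * b_from + B.
    """
    A = pstdev(b_to) / pstdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B


def mean_mean(a_from, b_from, a_to, b_to):
    """Mean/Mean: match mean discriminations and mean difficulties."""
    A = mean(a_from) / mean(a_to)
    B = mean(b_to) - A * mean(b_from)
    return A, B


def rescale(a, b, A, B):
    """Place (a, b) estimates from the 'from' scale onto the 'to' scale."""
    return [ai / A for ai in a], [A * bi + B for bi in b]


# Hypothetical common-item estimates: the 'to' scale is an exact linear
# transformation of the 'from' scale with A = 2 and B = 0.5.
a_from = [0.8, 1.0, 1.2, 1.6]
b_from = [-1.0, 0.0, 1.0, 2.0]
a_to = [ai / 2.0 for ai in a_from]
b_to = [2.0 * bi + 0.5 for bi in b_from]

A_ms, B_ms = mean_sigma(b_from, b_to)
A_mm, B_mm = mean_mean(a_from, b_from, a_to, b_to)
a_scaled, b_scaled = rescale(a_from, b_from, A_ms, B_ms)
```

With error-free estimates both methods recover the same A and B. With real estimates they generally differ, because Mean/Mean depends on the discrimination estimates while Mean/Sigma depends only on the difficulty estimates. The characteristic curve methods (Stocking-Lord and Haebara) instead choose A and B by numerically minimizing differences between item or test characteristic curves and are not shown here.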
Pages: 3-24
Page count: 22
Related papers
13 in total
[1]  
ACT, 1997, ACT ASS TECHN MAN
[2]  
BEGUIN AA, 2000, ANN M NAT COUNC MEAS
[3]  
HAEBARA T, 1980, JPN PSYCHOL RES, V22, P144, DOI 10.4992/psycholres1954.22.144
[4]   A comparison of linking and concurrent calibration under item response theory [J].
Kim, SH ;
Cohen, AS .
APPLIED PSYCHOLOGICAL MEASUREMENT, 1998, 22 (02) :131-143
[5]  
Kolen M. J., 1995, Test equating methods and practices, DOI 10.1007/978-1-4757-2412-7
[6]  
LORD FM, 1980, APPL ITEM RESPONSE P
[7]  
Mislevy R. J., 1990, BILOG 3
[8]  
Petersen N.S., 1983, J ED STAT, V8, P137, DOI 10.2307/1164922
[9]   DEVELOPING A COMMON METRIC IN ITEM RESPONSE THEORY [J].
STOCKING, ML ;
LORD, FM .
APPLIED PSYCHOLOGICAL MEASUREMENT, 1983, 7 (02) :201-210
[10]  
Thissen D, 1991, MULTILOG USERS GUIDE