Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients

Cited by: 16
Authors
Andersson, Bjorn [1]
Xin, Tao [1]
Affiliations
[1] Beijing Normal Univ, Collaborat Innovat Ctr Assessment Basic Educ Qual, 19 Xinjiekou Wai St, Beijing 100875, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
reliability; item response theory; confidence intervals; asymptotic variance; ASYMPTOTIC STANDARD ERRORS; POLYTOMOUS IRT MODELS; EM ALGORITHM; DISTRIBUTIONS; SCORES; TESTS;
DOI
10.1177/0013164417713570
Chinese Library Classification (CLC) number
G44 [Educational Psychology]
Discipline classification codes
0402; 040202
Abstract
In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability is typically not reported. In this study, the asymptotic variances of the IRT marginal and test reliability coefficient estimators are derived for dichotomous and polytomous IRT models assuming an underlying asymptotically normally distributed item parameter estimator. The results are used to construct confidence intervals for the reliability coefficients. Simulations are presented which show that the confidence intervals for the test reliability coefficient have good coverage properties in finite samples under a variety of settings with the generalized partial credit model and the three-parameter logistic model. Meanwhile, it is shown that the estimator of the marginal reliability coefficient has finite sample bias resulting in confidence intervals that do not attain the nominal level for small sample sizes but that the bias tends to zero as the sample size increases.
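The interval construction described in the abstract rests on a standard large-sample argument: if an item parameter estimator is asymptotically normal, a smooth function of it (such as a reliability coefficient) is also asymptotically normal, with variance given by the delta method. The sketch below illustrates only that general idea. It is not the analytical derivation from the article: the function `toy_reliability`, the parameter values, and the covariance matrix are hypothetical, and the gradient is approximated numerically rather than derived in closed form.

```python
import numpy as np


def numerical_gradient(g, x, eps=1e-6):
    """Central-difference approximation of the gradient of g at x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (g(x + step) - g(x - step)) / (2 * eps)
    return grad


def delta_method_ci(g, est, cov, z=1.959963984540054):
    """Wald-type interval for g(est), given the covariance matrix of est.

    The default z is the 0.975 standard normal quantile (a 95% interval).
    """
    est = np.asarray(est, dtype=float)
    cov = np.asarray(cov, dtype=float)
    grad = numerical_gradient(g, est)
    variance = float(grad @ cov @ grad)  # delta method: grad' * Sigma * grad
    se = float(np.sqrt(variance))
    point = float(g(est))
    return point, se, (point - z * se, point + z * se)


def toy_reliability(params):
    """Hypothetical coefficient: true variance / (true + error variance)."""
    true_var, error_var = params
    return true_var / (true_var + error_var)


# Hypothetical estimates and covariance matrix, chosen for illustration only.
est = np.array([4.0, 1.0])
cov = np.array([[0.04, 0.0],
                [0.0, 0.01]])

point, se, (lower, upper) = delta_method_ci(toy_reliability, est, cov)
print(f"estimate={point:.3f}, SE={se:.4f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

In an actual IRT application, `est` and `cov` would come from the fitted model's item parameter estimates and their estimated asymptotic covariance matrix, and `g` would be the chosen reliability coefficient written as a function of those parameters.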
Pages: 32-45
Number of pages: 14