Multilevel Reliability Measures of Latent Scores Within an Item Response Theory Framework

Cited by: 7

Authors:

Cho, Sun-Joo [1]

Shen, Jianhong [2]

Naveiras, Matthew [1]

Affiliations:

[1] Vanderbilt Univ, Nashville, TN 37203 USA

[2] axialHealthcare, Nashville, TN USA

Source:

MULTIVARIATE BEHAVIORAL RESEARCH | 2019 / Vol. 54 / Issue 06

Keywords:

Bayesian analysis; item response theory; marginal maximum likelihood estimation; multilevel model; multiple imputation; reliability coefficient; MAXIMUM-LIKELIHOOD-ESTIMATION; CROSS-LEVEL MEASUREMENT; SIGNAL NOISE RATIO; MEASUREMENT INVARIANCE; INFORMATION FUNCTION; IRT; ABILITY; MODEL; PARAMETER; ACCURACY;

DOI:

10.1080/00273171.2019.1596780

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

This paper evaluated multilevel reliability measures in two-level nested designs (e.g., students nested within teachers) within an item response theory framework. A simulation study was implemented to investigate the behavior of the multilevel reliability measures and the uncertainty associated with the measures in various multilevel designs regarding the number of clusters, cluster sizes, and intraclass correlations (ICCs), and in different test lengths, for two parameterizations of multilevel item response models with separate item discriminations or the same item discrimination over levels. Marginal maximum likelihood estimation (MMLE)-multiple imputation and Bayesian analysis were employed to evaluate the accuracy of the multilevel reliability measures and the empirical coverage rates of Monte Carlo (MC) confidence or credible intervals. Considering the accuracy of the multilevel reliability measures and the empirical coverage rate of the intervals, the results lead us to generally recommend MMLE-multiple imputation. In the model with separate item discriminations over levels, marginally acceptable accuracy of the multilevel reliability measures and empirical coverage rate of the MC confidence intervals were found only in a limited condition (200 clusters, a cluster size of 30, an ICC of .2, and 40 items) under MMLE-multiple imputation. In the model with the same item discrimination over levels, the accuracy of the multilevel reliability measures and the empirical coverage rate of the MC confidence intervals were acceptable in all multilevel designs we considered with 40 items under MMLE-multiple imputation. We discuss these findings and provide guidelines for reporting multilevel reliability measures.


Pages: 856-881

Page count: 26

