Investigating Subscores of VERA 3 German Test Based on Item Response Theory/Multidimensional Item Response Theory Models

Times Cited: 1
Authors
Temel, Gueler Yavuz [1 ]
Machunsky, Maya [1 ]
Rietz, Christian [1 ]
Okropiridze, Dimitry [1 ]
Affiliations
[1] University of Education Heidelberg, Faculty of Education and Social Sciences, Heidelberg, Germany
Keywords
IRT; MIRT; VERA tests; subscore; reliability; likelihood estimation; information; fit; precision; score
DOI
10.3389/feduc.2022.801372
CLC Number
G40 [Education]
Subject Classification Codes
040101; 120403
Abstract
In this study, the psychometric properties of the listening and reading subtests of the German VERA 3 test were examined using Item Response Theory (IRT) and Multidimensional Item Response Theory (MIRT) models. Listening and reading subscores were estimated with unidimensional Rasch, 1PL, and 2PL models, and total scores on the German test (listening + reading) were estimated with unidimensional and multidimensional IRT models. Several MIRT models were fitted, and model fit was compared in a cross-validation study. The results showed that unidimensional models of the reading and listening subtests and of the overall German test provided good model-data fit; however, multidimensional models of the subtests fit better. Although the subtest scores also fit adequately when modeled separately, estimating the scores of the overall test with a model that includes a general factor (construct) in addition to the specific factors (e.g., a bifactor model) significantly improved the psychometric properties of the test. The general factor had the highest reliability values, whereas the reliabilities of the specific factors were very low. In addition to model-data fit, person fit under the IRT/MIRT models was examined. The proportion of person misfit was higher for the subtests than for the overall test, but the proportion of person overfit was lower. NA-German students (students who did not speak German all day) had the highest proportion of misfit under all models.
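The models named in the abstract are standard in the IRT literature; as a point of reference, a brief sketch of their item response functions follows. The notation is assumed here, not taken from the paper. Under the unidimensional 2PL model in slope-intercept form, the probability that person with ability $\theta$ answers item $j$ correctly is

$$P(X_j = 1 \mid \theta) = \frac{1}{1 + \exp\!\left(-\left(a_j \theta + d_j\right)\right)},$$

where $a_j$ is the item discrimination and $d_j$ the intercept; the 1PL model constrains all $a_j$ to a common value, and the Rasch model fixes them to 1. The bifactor model mentioned in the abstract adds a general factor $\theta_g$ (overall German proficiency) alongside an item's specific factor $\theta_{s(j)}$ (listening or reading):

$$P(X_j = 1 \mid \theta_g, \theta_{s(j)}) = \frac{1}{1 + \exp\!\left(-\left(a_{jg}\,\theta_g + a_{js}\,\theta_{s(j)} + d_j\right)\right)}.$$

Person fit of the kind reported in the abstract is commonly assessed with a standardized log-likelihood index such as

$$l_z = \frac{l_0 - \mathrm{E}[l_0]}{\sqrt{\mathrm{Var}(l_0)}},$$

where $l_0$ is the log-likelihood of a person's observed response pattern under the fitted model; large negative values flag misfit, and large positive values overfit. Whether the paper uses exactly this index is an assumption here.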
Pages: 13