The Sequential Probability Ratio Test and Binary Item Response Models

Cited by: 5
Author
Nydick, Steven W. [1]
Affiliation
[1] University of Minnesota, Minneapolis, MN 55455, USA
Keywords
sequential probability ratio test; item response theory; computerized adaptive testing; computerized classification tests; three-parameter logistic model; stochastic curtailment; classifying examinees; selection; categories; design; SPRT
DOI
10.3102/1076998614524824
Chinese Library Classification (CLC)
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
The sequential probability ratio test (SPRT) is a common method for terminating item response theory (IRT)-based adaptive classification tests. To decide whether a classification test should stop, the SPRT compares a simple log-likelihood ratio, based on the classification bound separating two categories, to prespecified critical values. As has been previously noted (Spray & Reckase, 1994), the SPRT test statistic is not necessarily monotonic with respect to the classification bound when item response functions have nonzero lower asymptotes. Because of nonmonotonicity, several researchers (including Spray & Reckase, 1994) have recommended selecting items at the classification bound rather than the current ability estimate when terminating SPRT-based classification tests. Unfortunately, this well-worn advice is a bit too simplistic. Items yielding optimal evidence for classification depend on the IRT model, item parameters, and location of an examinee with respect to the classification bound. The current study illustrates, in depth, the relationship between the SPRT test statistic and classification evidence in binary IRT models. Unlike earlier studies, we examine the form of the SPRT-based log-likelihood ratio while altering the classification bound and item difficulty. These investigations motivate a novel item selection algorithm based on optimizing the expected SPRT criterion given the current ability estimate. The new expected log-likelihood ratio algorithm results in test lengths noticeably shorter than those of current, commonly used algorithms, with no loss in classification accuracy.
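The stopping rule and selection criterion described in the abstract can be sketched in a few lines of code. The Python fragment below is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a three-parameter logistic (3PL) item response function, indifference-region hypotheses theta0 and theta1 placed symmetrically around the classification bound, Wald's critical values derived from nominal error rates alpha and beta, and one plausible reading of the expected log-likelihood ratio criterion (the expected LLR contribution evaluated at the current ability estimate). All function names, parameter values, and the exact form of the selection score are assumptions for illustration.

```python
import math

def p_3pl(theta, a, b, c):
    """3PL item response function: probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def llr_term(x, item, theta0, theta1):
    """One item's contribution to the SPRT log-likelihood ratio,
    testing H1: theta = theta1 against H0: theta = theta0."""
    a, b, c = item
    p0, p1 = p_3pl(theta0, a, b, c), p_3pl(theta1, a, b, c)
    return math.log(p1 / p0) if x == 1 else math.log((1.0 - p1) / (1.0 - p0))

def sprt_decision(llr, alpha=0.05, beta=0.05):
    """Wald's stopping rule: classify the examinee or keep testing."""
    upper = math.log((1.0 - beta) / alpha)   # classify above the bound
    lower = math.log(beta / (1.0 - alpha))   # classify below the bound
    if llr >= upper:
        return "above"
    if llr <= lower:
        return "below"
    return "continue"

def expected_llr(item, theta_hat, theta0, theta1):
    """Expected LLR contribution of an item at the current ability
    estimate theta_hat (assumed form of the selection criterion)."""
    a, b, c = item
    p_hat = p_3pl(theta_hat, a, b, c)
    return (p_hat * llr_term(1, item, theta0, theta1)
            + (1.0 - p_hat) * llr_term(0, item, theta0, theta1))

# Hypothetical usage: pick the pool item with the largest absolute
# expected LLR, administer it, add its llr_term to the running LLR,
# then apply the stopping rule.
bound, delta = 0.0, 0.3                      # classification bound, half-width
theta0, theta1 = bound - delta, bound + delta
pool = [(1.2, -0.5, 0.20), (0.8, 0.4, 0.15), (1.5, 0.1, 0.25)]  # (a, b, c)
best = max(pool, key=lambda it: abs(expected_llr(it, 0.4, theta0, theta1)))
```

Scoring items by the absolute expected criterion, rather than by information at the fixed bound, mirrors the abstract's point that the most useful item depends on where the examinee sits relative to the bound, not on the bound alone.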
Pages: 203-230
Number of pages: 28
References (44 records)
[1] R Core Team. (2013). R: A language and environment for statistical computing [Computer software].
[2] [Anonymous]. (2000). Computerized adaptive testing. DOI: 10.4324/9781410605931
[3] [Anonymous]. Educational Measurement: Issues and Practice.
[4] [Anonymous]. (2001). Statistical inference.
[5] Bartroff, J., Finkelman, M., & Lai, T. L. (2008). Modern sequential analysis and its applications to computerized adaptive testing. Psychometrika, 73(3), 473-486.
[6] Camilli, G. (1994). Journal of Educational and Behavioral Statistics, 19(3), 293. DOI: 10.3102/10769986019003293
[7] Chang, H. H., & Ying, Z. L. (1996). A global information approach to computerized adaptive testing. Applied Psychological Measurement, 20(3), 213-229.
[8] DiBello, L. V., & Stout, W. (2007). Guest editors' introduction and overview: IRT-based cognitive diagnostic models and related methods. Journal of Educational Measurement, 44(4), 285-291.
[9] Eggen, T. J. H. M. (2011). Computerized classification testing with the Rasch model. Educational Research and Evaluation, 17(5), 361-371.
[10] Eggen, T. J. H. M., & Straetmans, G. J. J. M. (2000). Computerized adaptive testing for classifying examinees into three categories. Educational and Psychological Measurement, 60(5), 713-734.