Unfolding the phenomenon of interrater agreement: a multicomponent approach for in-depth examination was proposed

Cited by: 7
Authors
Slaug, Bjorn [1 ]
Schilling, Oliver [2 ]
Helle, Tina [1 ,3 ]
Iwarsson, Susanne [1 ]
Carlsson, Gunilla [1 ]
Brandt, Ase [4 ]
Affiliations
[1] Lund Univ, Dept Hlth Sci, Fac Med, SE-22100 Lund, Sweden
[2] Heidelberg Univ, Dept Psychol Ageing Res, Inst Psychol, D-69115 Heidelberg, Germany
[3] Univ Coll No Jutland, Dept Occupat Therapy, DK-9100 Aalborg, Denmark
[4] Natl Board Social Serv, Resource Ctr Handicap Assist Technol & Social Psy, DK-5000 Odense, Denmark
Keywords
Interrater; Reliability; Agreement; Kappa; Methodology; Recommendations
DOI
10.1016/j.jclinepi.2012.02.016
Chinese Library Classification (CLC)
R19 [Health care organization and administration (health service management)]
Abstract
Objective: The overall objective was to unfold the phenomenon of interrater agreement: to identify potential sources of variation in agreement data and to explore how they can be statistically accounted for. The ultimate aim was to propose recommendations for in-depth examination of agreement to improve the reliability of assessment instruments. Study Design and Setting: Using a sample in which 10 rater pairs had assessed the presence/absence of 188 environmental barriers using a systematic rating form, a raters × items data set was generated (N = 1,880). In addition to common agreement indices, relative shares of agreement variation were calculated. Multilevel regression analysis was carried out, using rater and item characteristics as predictors of agreement variation. Results: Following a conceptual decomposition, the agreement variation was statistically disentangled into relative shares. The raters accounted for 6-11%, the items for 32-33%, and the residual for 57-60% of the variation. Multilevel regression analysis showed barrier prevalence and raters' familiarity with using standardized instruments to have the strongest impact on agreement. Conclusion: Supported by a conceptual analysis, we propose an approach for in-depth examination of agreement variation as a strategy for increasing the level of interrater agreement. By identifying and limiting the most important sources of disagreement, instrument reliability can ultimately be improved. (C) 2012 Elsevier Inc. All rights reserved.
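The abstract describes partitioning agreement variation into rater, item, and residual components on a rater-pair × item grid of binary ratings. The sketch below is not the authors' code: it simulates such a grid (the 10 pairs × 188 items dimensions are taken from the abstract; the prevalence distribution, the 15% disagreement rate, and the simple two-way sums-of-squares decomposition are illustrative assumptions) and shows one plausible way to compute overall percentage agreement and relative shares of agreement variation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs, n_items = 10, 188  # mirrors the study's 10 rater pairs and 188 environmental barriers

# Simulate two raters per pair judging presence/absence of each barrier.
item_prevalence = rng.beta(2, 5, size=n_items)          # assumed per-item barrier prevalence
rater_a = rng.binomial(1, item_prevalence, size=(n_pairs, n_items))
flip = rng.binomial(1, 0.15, size=(n_pairs, n_items))   # assumed 15% chance the second rater disagrees
rater_b = np.where(flip == 1, 1 - rater_a, rater_a)

# Rater-pair x item agreement indicator (1 = the pair agreed on that item).
agree = (rater_a == rater_b).astype(float)
print(f"Overall percentage agreement: {agree.mean():.3f}")

# Relative shares of agreement variation: a two-way sums-of-squares
# decomposition into rater-pair, item, and residual components.
grand = agree.mean()
ss_total = ((agree - grand) ** 2).sum()
ss_pairs = n_items * ((agree.mean(axis=1) - grand) ** 2).sum()
ss_items = n_pairs * ((agree.mean(axis=0) - grand) ** 2).sum()
ss_resid = ss_total - ss_pairs - ss_items

for name, ss in [("rater pairs", ss_pairs), ("items", ss_items), ("residual", ss_resid)]:
    print(f"Share of agreement variation, {name}: {ss / ss_total:.1%}")
```

In the study itself, agreement was further modeled with multilevel regression using rater and item characteristics (e.g., barrier prevalence and raters' familiarity with standardized instruments) as predictors; the decomposition above only illustrates the partitioning idea, not the authors' actual statistical procedure.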
Pages: 1016-1025
Number of pages: 10