Evaluating Agreement: Conducting a Reliability Study

Cited by: 136
Authors
Karanicolas, Paul J.
Bhandari, Mohit
Kreder, Hans
Moroni, Antonio
Richardson, Martin
Walter, Stephen D.
Norman, Geoff R.
Guyatt, Gordon H.
Affiliations
[1] Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON L8L 2X2
[2] Department of Surgery, University of Toronto, Sunnybrook Campus, MG365, Toronto, ON M4N 3M5
[3] Rizzoli Orthopaedic Institute, University of Bologna, Bologna 40136
[4] Department of Orthopaedic Surgery, Royal Melbourne Hospital, University of Melbourne, Grattan Street, Melbourne, VIC
Funding
Canadian Institutes of Health Research
Keywords
SAMPLE-SIZE; KAPPA; REPRODUCIBILITY; CLASSIFICATIONS; EQUALITY; SCALE;
DOI
10.2106/JBJS.H.01624
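Chinese Library Classification (CLC) number
R826.8 [Plastic surgery]; R782.2 [Oral and maxillofacial plastic surgery]; R726.2 [Pediatric plastic surgery]; R62 [Plastic surgery (reconstructive surgery)]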
Abstract
Instruments that are useful in clinical or research practice will, when the object of measurement is stable, yield similar results when applied at different times, in different situations, or by different users. Studies that measure the relation of differences between patients or subjects and measurement error (reliability studies) are becoming increasingly common in the orthopaedic literature. In this paper, we identify common aspects of reliability studies and suggest features that improve the reader's confidence in the results. One concept serves as the foundation for all further consideration: in order for a reliability study to be relevant, the patients, raters, and test administration in the study must be similar to the clinical or research context in which the instrument will be used. We introduce the statistical measures that readers will most commonly encounter in reliability studies, and we suggest an approach to sample-size estimation. Readers interested in critically appraising reliability studies or in developing their own reliability studies may find this review helpful.
Pages: 99-106
Page count: 8