A paired kappa to compare binary ratings across two medical tests

Cited by: 1
Authors
Nelson, Kerrie P. [1]
Edwards, Don [2]
Affiliations
[1] Boston Univ, Dept Biostat, Boston, MA 02118 USA
[2] Univ South Carolina, Dept Stat, Columbia, SC 29208 USA
Keywords
agreement; binary classifications; breast imaging; kappa; screening test; MAMMOGRAPHIC DENSITY; DIGITAL MAMMOGRAPHY; FILM MAMMOGRAPHY; AGREEMENT; ACCURACY; TOMOSYNTHESIS; RELIABILITY; COMBINATION; RATERS; MODELS;
DOI
10.1002/sim.8200
Chinese Library Classification
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Agreement between experts' ratings is an important prerequisite for an effective screening procedure. In clinical settings, large-scale studies are often conducted to compare the agreement of experts' ratings between new and existing medical tests, for example, digital versus film mammography. Challenges arise in these studies where many experts rate the same sample of patients undergoing two medical tests, leading to a complex correlation structure between experts' ratings. Here, we propose a novel paired kappa measure to compare the agreement between the binary ratings of many experts across two medical tests. Existing approaches can accommodate only a small number of experts, rely heavily on Cohen's kappa and Scott's pi measures of agreement, and thus are prone to their drawbacks. The proposed kappa appropriately accounts for correlations between ratings due to patient characteristics, corrects for agreement due to chance, and is robust to disease prevalence and other flaws inherent in the use of Cohen's kappa. It can be easily calculated in the software package R. In contrast to existing approaches, the proposed measure can flexibly incorporate large numbers of experts and patients by utilizing the generalized linear mixed models framework. It is intended to be used in population-based studies, increasing efficiency without increasing modeling complexity. Extensive simulation studies demonstrate low bias and excellent coverage probability of the proposed kappa under a broad range of conditions. Methods are applied to a recent nationwide breast cancer screening study comparing film mammography to digital mammography.
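The prevalence sensitivity of Cohen's kappa mentioned in the abstract can be illustrated with a small sketch. This is not the paired kappa proposed in the paper; it is only the classical two-rater Cohen's kappa, computed on two hypothetical 2x2 tables (the counts and the helper name `cohen_kappa_2x2` are made up for illustration). Both tables have the same observed agreement, but differ in how common a positive rating is.

```python
def cohen_kappa_2x2(a, b, c, d):
    """Cohen's kappa for two raters and a binary rating.

    a: both raters positive, d: both raters negative,
    b: rater 1 positive / rater 2 negative, c: the reverse.
    """
    n = a + b + c + d
    p_obs = (a + d) / n                      # observed agreement
    r1_pos = (a + b) / n                     # rater 1 positive rate
    r2_pos = (a + c) / n                     # rater 2 positive rate
    # Agreement expected by chance from the two raters' marginal rates
    p_exp = r1_pos * r2_pos + (1 - r1_pos) * (1 - r2_pos)
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical counts: both tables have 90% observed agreement.
kappa_balanced = cohen_kappa_2x2(45, 5, 5, 45)   # positives rated about 50% of the time
kappa_skewed = cohen_kappa_2x2(85, 5, 5, 5)      # positives rated about 90% of the time

print(round(kappa_balanced, 3), round(kappa_skewed, 3))
```

With identical observed agreement, the skewed table yields a much lower kappa because chance-expected agreement rises as the marginal rates move away from 50%. This is the kind of behaviour the abstract refers to when it describes the proposed measure as robust to disease prevalence.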
Pages: 3272-3287
Number of pages: 16