ROC curves in clinical chemistry: Uses, misuses, and possible solutions

Cited by: 277
Authors
Obuchowski, NA
Lieber, ML
Wians, FH
Affiliations
[1] Cleveland Clin Fdn, Dept Biostat, Cleveland, OH 44195 USA
[2] Cleveland Clin Fdn, Dept Epidemiol, Cleveland, OH 44195 USA
[3] Cleveland Clin Fdn, Div Radiol Wb4, Cleveland, OH 44195 USA
[4] Univ Texas, SW Med Ctr, Dept Pathol, Dallas, TX USA
DOI
10.1373/clinchem.2004.031823
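Chinese Library Classification number
R446 [Laboratory diagnosis]; R-33 [Experimental medicine, medical experiments]
Discipline classification code
1001
Abstract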
Background: ROC curves have become the standard for describing and comparing the accuracy of diagnostic tests. Not surprisingly, ROC curves are used often by clinical chemists. Our aims were to observe how the accuracy of clinical laboratory diagnostic tests is assessed, compared, and reported in the literature; to identify common problems with the use of ROC curves; and to offer some possible solutions. Methods: We reviewed every original work using ROC curves and published in Clinical Chemistry in 2001 or 2002. For each article we recorded phase of the research, prospective or retrospective design, sample size, presence/absence of confidence intervals (CIs), nature of the statistical analysis, and major analysis problems. Results: Of 58 articles, 31% were phase I (exploratory), 50% were phase II (challenge), and 19% were phase III (advanced) studies. The studies increased in sample size from phase I to III and showed a progression in the use of prospective designs. Most phase I studies were powered to assess diagnostic tests with ROC areas greater than or equal to 0.70. Thirty-eight percent of studies failed to include CIs for diagnostic test accuracy or the CIs were constructed inappropriately. Thirty-three percent of studies provided insufficient analysis for comparing diagnostic tests. Other problems included dichotomization of the gold standard scale and inappropriate analysis of the equivalence of two diagnostic tests. Conclusion: We identify available software and make some suggestions for sample size determination, testing for equivalence in diagnostic accuracy, and alternatives to a dichotomous classification of a continuous-scale gold standard. More methodologic research is needed in areas specific to clinical chemistry. © 2004 American Association for Clinical Chemistry.
Pages: 1118-1125
Page count: 8