Evaluating technologies for classification and prediction in medicine

Cited by: 38
Authors
Pepe, MS
Affiliations
[1] Fred Hutchinson Canc Res Ctr, Seattle, WA 98109 USA
[2] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Keywords
diagnostic test; receiver operating characteristic; odds ratio; disease screening; prognosis
DOI
10.1002/sim.2431
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Modern technologies promise to provide new ways of diagnosing disease, detecting subclinical disease, predicting prognosis, selecting patient-specific treatment, identifying subjects at risk for disease, and so forth. Advances in genomics, proteomics and imaging modalities in particular hold great potential for assisting with classification/prediction in medicine. Before a classifier can be adopted for routine use in health care, its classification accuracy must be determined. Standards for evaluating new clinical classifiers, however, lag far behind the well-established standards that exist for evaluating new clinical treatments. In this paper, we discuss a phased approach to developing a new classifier (or biomarker). It mirrors the internationally established phase 1-2-3 paradigm for therapeutic drugs. The defined phases lead to a logical sequence of studies for classifier development. We emphasize that evaluating classification accuracy is fundamentally different from simply establishing association with outcome. Therefore, study objectives and designs differ from the familiar methods of clinical trials. We discuss these briefly for each phase. Finally, we argue that classifier development requires some rethinking of traditional data analysis techniques. As an example, we show that maximizing the likelihood function to fit a logistic regression model to multiple predictors can yield a poor classifier. Instead, we demonstrate that an approach that maximizes an alternative objective function characterizing classification accuracy performs better. Copyright (c) 2005 John Wiley & Sons, Ltd.
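The abstract's last point, choosing a marker combination by maximizing a measure of classification accuracy rather than a likelihood, can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not name the objective function, so the empirical area under the ROC curve (AUC) is assumed here. The grid search, the function names, and the simulated two-marker data are likewise hypothetical.

```python
import math
import random


def empirical_auc(case_scores, control_scores):
    """Mann-Whitney estimate of the AUC: P(case score > control score), ties count 1/2."""
    wins = 0.0
    for c in case_scores:
        for d in control_scores:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def combine(markers, theta):
    """Score each subject with the linear combination cos(theta)*x1 + sin(theta)*x2."""
    return [math.cos(theta) * x1 + math.sin(theta) * x2 for x1, x2 in markers]


def best_combination(cases, controls, n_grid=90):
    """Grid search over directions theta for the combination with the highest empirical AUC."""
    best_theta, best_auc = 0.0, -1.0
    for k in range(n_grid):
        theta = 2.0 * math.pi * k / n_grid
        auc = empirical_auc(combine(cases, theta), combine(controls, theta))
        if auc > best_auc:
            best_theta, best_auc = theta, auc
    return best_theta, best_auc


# Hypothetical simulated data (an assumption for illustration only):
# two markers per subject, with cases shifted upward on the first marker.
rng = random.Random(0)
controls = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
cases = [(rng.gauss(1, 1), rng.gauss(0, 1)) for _ in range(100)]

theta_hat, auc_hat = best_combination(cases, controls)
print(f"theta = {theta_hat:.2f} rad, empirical AUC = {auc_hat:.3f}")
```

Because the AUC is unchanged by positive rescaling of the score, only the direction of the coefficient vector matters, which is why the sketch searches over a single angle. With more than two markers a grid becomes impractical and a different optimizer would be needed.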
Pages: 3687-3696
Page count: 10
Related papers
31 in total
  • [1] [Anonymous], 1999, STAT MED, V18, P1905
  • [2] The central role of receiver operating characteristic (ROC) curves in evaluating tests for the early detection of cancer
    Baker, SG
    [J]. JOURNAL OF THE NATIONAL CANCER INSTITUTE, 2003, 95 (07) : 511 - 515
  • [3] Markers for early detection of cancer: Statistical guidelines for nested case-control studies
    Baker, SG
    Kramer, BS
    Srivastava, S
    [J]. BMC Medical Research Methodology, 2 (1) : 1 - 8
  • [4] Advances in statistical methodology for diagnostic medicine in the 1980s
    Begg, CB
    [J]. STATISTICS IN MEDICINE, 1991, 10 (12) : 1887 - 1895
  • [5] Towards complete and accurate reporting of studies of diagnostic accuracy: The STARD initiative
    Bossuyt, PM
    Reitsma, JB
    Bruns, DE
    Gatsonis, CA
    Glasziou, PP
    Irwig, LM
    Lijmer, JG
    Moher, D
    Rennie, D
    de Vet, HCW
    [J]. CLINICAL CHEMISTRY, 2003, 49 (01) : 1 - 6
  • [6] Towards complete and accurate reporting of studies of diagnostic accuracy: The STARD initiative
    Bossuyt, PM
    Reitsma, JB
    Bruns, DE
    Gatsonis, CA
    Glasziou, PP
    Irwig, LM
    Lijmer, JG
    Moher, D
    Rennie, D
    de Vet, HCW
    [J]. ANNALS OF INTERNAL MEDICINE, 2003, 138 (01) : 40 - 44
  • [7] Emir B, 1998, STAT MED, V17, P2563, DOI 10.1002/(SICI)1097-0258(19981130)17:22<2563::AID-SIM952>3.3.CO;2-F
  • [9] Incorporating the time dimension in receiver operating characteristic curves: A case study of prostate cancer
    Etzioni, R
    Pepe, M
    Longton, G
    Hu, CC
    Goodman, G
    [J]. MEDICAL DECISION MAKING, 1999, 19 (03) : 242 - 251
  • [10] Misguided efforts and future challenges for research on "diagnostic tests"
    Feinstein, AR
    [J]. JOURNAL OF EPIDEMIOLOGY AND COMMUNITY HEALTH, 2002, 56 (05) : 330 - 332