Estimating classification probabilities in high-dimensional diagnostic studies

Times cited: 6

Authors:

Appel, Inka J. [1]

Gronwald, Wolfram [1]

Spang, Rainer [1]

Affiliation:

[1] Univ Regensburg, Inst Funct Genom, D-93053 Regensburg, Germany

Source:

BIOINFORMATICS | 2011 / Volume 27 / Issue 18

Keywords:

GENE; CANCER; DISEASE;

DOI:

10.1093/bioinformatics/btr434

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Subject classification codes:

071010 ; 081704 ;

Abstract:

Motivation: Classification algorithms for high-dimensional biological data like gene expression profiles or metabolomic fingerprints are typically evaluated by the number of misclassifications across a test dataset. However, to judge the classification of a single case in the context of clinical diagnosis, we need to assess the uncertainties associated with that individual case rather than the average accuracy across many cases. Reliability of individual classifications can be expressed in terms of class probabilities. While classification algorithms are a well-developed area of research, the estimation of class probabilities is considerably less progressed in biology, with only a few classification algorithms that provide estimated class probabilities. Results: We compared several probability estimators in the context of classification of metabolomics profiles. Evaluation criteria included sparseness biases, calibration of the estimator, the variance of the estimator and its performance in identifying highly reliable classifications. We observed that several of them display artifacts that compromise their use in practice. Classification probabilities based on a combination of local cross-validation error rates and monotone regression prove superior in metabolomic profiling.
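The abstract names monotone regression as one ingredient of the best-performing estimator. The following is a minimal sketch of that single ingredient only, not the authors' procedure: it fits a non-decreasing map from held-out classifier scores to class probabilities with the pool-adjacent-violators algorithm. The function names `pava` and `calibrate`, the toy scores and labels, and the use of plain cross-validated decision values (rather than the local cross-validation error rates the abstract describes) are all assumptions made for illustration.

```python
def pava(values, weights=None):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [mean, total weight, number of points pooled].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def calibrate(scores, labels):
    """Map classifier scores to monotone class-probability estimates.

    scores: held-out decision values, higher meaning class 1 (assumed input).
    labels: 0/1 true classes for the same cases.
    Returns (sorted_scores, probabilities) as parallel lists.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    sorted_scores = [scores[i] for i in order]
    probs = pava([labels[i] for i in order])
    return sorted_scores, probs


# Hypothetical toy data, not taken from the paper.
scores = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0]
labels = [0, 0, 1, 0, 1, 1, 0, 1]
xs, ps = calibrate(scores, labels)
```

Each fitted value is the pooled fraction of class-1 cases in a run of neighbouring scores, so the estimates stay within [0, 1] and never decrease as the score increases. A probability for a new case could then be read off by locating its score among `xs`; how that lookup is done (step function or interpolation) is a further design choice this sketch leaves open.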


Pages: 2563-2570

Page count: 8

