Improving reliability of gene selection from microarray functional genomics data

Cited by: 17

Authors:

Fu, LM [1]

Youn, ES

Affiliations:

[1] Univ Florida, Gainesville, FL 32611 USA

[2] Pacific TB & Canc Res Org, Los Angeles, CA USA

[3] Univ Florida, Gainesville, FL 32611 USA

Source:

IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE | 2003 / Vol. 7 / No. 03

Funding:

National Science Foundation (USA);

Keywords:

bootstrap; functional genomics; gene expression; gene selection; microarray; support vector machine (SVM);

DOI:

10.1109/TITB.2003.816558

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Constructing a classifier based on microarray gene expression data has recently emerged as an important problem for cancer classification. Recent results have suggested the feasibility of constructing such a classifier with reasonable predictive accuracy under the circumstance where only a small number of cancer tissue samples of known type are available. Difficulty arises from the fact that each sample contains the expression data of a vast number of genes and these genes may interact with one another. Selection of a small number of critical genes is fundamental to correctly analyze the otherwise overwhelming data. It is essential to use a multivariate approach for capturing the correlated structure in the data. However, the curse of dimensionality leads to the concern about the reliability of selected genes. Here, we present a new gene selection method in which error and repeatability of selected genes are assessed within the context of M-fold cross-validation. In particular, we show that the method is able to identify source variables underlying data generation.

Pages: 191-196

Page count: 6
