A Hybrid Approach for Biomarker Discovery from Microarray Gene Expression Data for Cancer Classification

Cited by: 0
Authors
Peng, Yanxiong [1, 2]
Li, Wenyuan [1, 2]
Liu, Ying [1, 2, 3]
Affiliations
[1] Univ Texas Dallas, Lab Bioinformat & Med Informat, Richardson, TX 75083 USA
[2] Univ Texas Dallas, Dept Comp Sci, POB 830688, Richardson, TX 75083 USA
[3] Univ Texas Dallas, Dept Mol & Cell Biol, Richardson, TX 75083 USA
Keywords
Biomarker discovery; Gene expression; Cancer classification; Microarray; Gene selection
DOI
Not available
Chinese Library Classification (CLC) number
R73 [Oncology]
Discipline classification code
100214
Abstract
Microarrays allow researchers to monitor the expression patterns of tens of thousands of genes across a wide range of cellular responses, phenotypes, and conditions. Selecting a small subset of discriminative genes from these thousands is important for accurate classification of diseases and phenotypes. Many methods have been proposed to find subsets of genes with maximum relevance and minimum redundancy that can accurately distinguish between samples with different labels. Finding the minimum subset of relevant genes is often referred to as biomarker discovery. Two main approaches, filter and wrapper techniques, have been applied to biomarker discovery. In this paper, we conducted a comparative study of different biomarker discovery methods, including six filter methods and three wrapper methods. We then proposed a hybrid approach, FR-Wrapper, for biomarker discovery. The aim of this approach is to find an optimum balance between the precision of biomarker discovery and the computation cost by taking advantage of both the efficiency of filter methods and the high accuracy of wrapper methods. Our hybrid approach first applies Fisher's ratio, a simple method that is easy to understand and implement, to filter out most of the irrelevant genes; a wrapper method is then employed to reduce the redundancy among the remaining genes. The performance of the FR-Wrapper approach is evaluated on four widely used microarray datasets. Analysis of the experimental results reveals that the hybrid approach can achieve the goal of maximum relevance with minimum redundancy.
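The two-stage idea described in the abstract (a Fisher's-ratio filter followed by a wrapper) can be illustrated with a minimal sketch. This is not the authors' FR-Wrapper implementation: the abstract does not specify the wrapper search strategy or the classifier, so the greedy forward search, the nearest-centroid classifier, the leave-one-out evaluation, the two-class form of Fisher's ratio, the function names, and the toy data below are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Two-class Fisher's ratio (assumed form): (mu0 - mu1)^2 / (var0 + var1)."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    diff2 = (mean(a) - mean(b)) ** 2
    denom = pvariance(a) + pvariance(b)
    if denom == 0:
        return float("inf") if diff2 > 0 else 0.0
    return diff2 / denom


def filter_genes(X, labels, top_k):
    """Filter stage: rank genes (columns of X) by Fisher's ratio, keep top_k indices."""
    n_genes = len(X[0])
    scores = [fisher_ratio([row[j] for row in X], labels) for j in range(n_genes)]
    return sorted(range(n_genes), key=lambda j: scores[j], reverse=True)[:top_k]


def loo_accuracy(X, labels, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on the given genes.

    Assumes every class has at least two samples, so no class is emptied
    when one sample is held out.
    """
    correct = 0
    for i, (row, y) in enumerate(zip(X, labels)):
        centroids = {}
        for c in sorted(set(labels)):
            members = [X[k] for k in range(len(X)) if k != i and labels[k] == c]
            centroids[c] = [mean(m[g] for m in members) for g in genes]
        pred = min(
            centroids,
            key=lambda c: sum((row[g] - m) ** 2 for g, m in zip(genes, centroids[c])),
        )
        correct += pred == y
    return correct / len(X)


def forward_wrapper(X, labels, candidates):
    """Wrapper stage: greedily add the gene that most improves accuracy.

    Stops when no remaining gene strictly improves accuracy, which leaves
    redundant genes out of the selected subset.
    """
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        scored = [(loo_accuracy(X, labels, selected + [g]), g) for g in remaining]
        acc, gene = max(scored, key=lambda t: t[0])
        if acc <= best:
            break
        selected.append(gene)
        remaining.remove(gene)
        best = acc
    return selected, best


# Hypothetical toy data: 6 samples x 4 genes. Gene 0 separates the classes,
# gene 1 is a noisier near-copy of gene 0 (redundant), genes 2 and 3 are noise.
X = [
    [1.0, 1.1, 5.0, 0.3],
    [1.2, 1.5, 2.0, 0.9],
    [0.9, 0.9, 4.0, 0.5],
    [3.0, 3.1, 4.5, 0.4],
    [3.2, 3.4, 2.5, 0.8],
    [2.9, 2.8, 5.5, 0.6],
]
labels = [0, 0, 0, 1, 1, 1]

kept = filter_genes(X, labels, top_k=2)              # cheap filter pass
selected, accuracy = forward_wrapper(X, labels, kept)  # costly wrapper pass
print(kept, selected, accuracy)
```

On this toy data the filter keeps the two informative genes and the wrapper then discards the redundant one, which is the relevance-then-redundancy split the abstract describes. The wrapper is run only on the filtered candidates, which is where the saving in computation cost would come from.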
Pages: 301-311
Page count: 11