Multi-step dimensionality reduction and semi-supervised graph-based tumor classification using gene expression data

Cited by: 35
Authors
Gui, Jie [1 ,2 ]
Wang, Shu-Lin [1 ,3 ]
Lei, Ying-Ke [1 ,2 ,4 ]
Affiliations
[1] Chinese Acad Sci, Intelligent Comp Lab, Hefei Inst Intelligent Machines, Hefei 230031, Anhui, Peoples R China
[2] Univ Sci & Technol China, Dept Automat, Hefei 230026, Anhui, Peoples R China
[3] Hunan Univ, Sch Comp & Commun, Changsha 410082, Hunan, Peoples R China
[4] Inst Elect Engn, Hefei 230037, Anhui, Peoples R China
Funding
China Postdoctoral Science Foundation; U.S. National Science Foundation;
Keywords
Multi-step dimensionality reduction; Gene ranking; Discrete cosine transform; Principal component analysis; Semi-supervised learning; Microarray data analysis; Tumor diagnosis; MOLECULAR CLASSIFICATION; SAMPLE CLASSIFICATION; CANCER; MICROARRAY; SELECTION; PREDICTION; DIAGNOSIS;
DOI
10.1016/j.artmed.2010.05.004
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Objective: Both supervised and unsupervised methods have been widely used to solve the tumor classification problem based on gene expression profiles. This paper introduces a semi-supervised graph-based method for tumor classification. Feature extraction plays a key role in tumor classification based on gene expression profiles and can greatly improve the performance of a classifier. In this paper, we propose a novel multi-step dimensionality reduction method for extracting tumor-related features.
Methods and materials: First, the Wilcoxon rank-sum test is used for gene selection. Then, gene ranking and the discrete cosine transform are combined with principal component analysis for feature extraction. Finally, the performance is evaluated by semi-supervised learning algorithms.
Results: To show the validity of the proposed method, we apply it to classify four tumor datasets involving various human normal and tumor tissue samples. The experimental results show that the proposed method is efficient and feasible. Compared with other methods, our method achieves relatively higher prediction accuracy. In particular, the semi-supervised method is found to be superior to support vector machines in classification performance.
Conclusions: The proposed approach can effectively improve the performance of tumor classification based on gene expression profiles. This work is a meaningful attempt to explore and apply multi-step dimensionality reduction and semi-supervised learning methods in the field of tumor classification. Considering the high classification accuracy, there should be much room for the application of multi-step dimensionality reduction and semi-supervised learning methods to tumor classification.
© 2010 Elsevier B.V. All rights reserved.
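The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how gene ranking, the discrete cosine transform, and principal component analysis are combined, so the sketch simply ranks genes by a Wilcoxon rank-sum z-score, applies an unnormalized DCT-II to the selected genes, and omits the PCA step for brevity. The classifier is a generic iterative label propagation on a Gaussian-similarity graph, swapped in as an assumed stand-in for the paper's semi-supervised graph-based method. All function names, parameters (`k`, `sigma`, `alpha`, `iters`), and the toy data are hypothetical.

```python
import math

def rank_sum_z(a, b):
    """Wilcoxon rank-sum z-score of sample a against sample b
    (normal approximation; ties get average ranks, no tie correction)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for t in range(i, j + 1):
            ranks[t] = (i + j) / 2 + 1  # average 1-based rank of the tie group
        i = j + 1
    w = sum(r for r, (_, grp) in zip(ranks, pooled) if grp == 0)
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean) / sd

def select_genes(X, y, k):
    """Indices of the k genes with the largest |z|, using labeled samples only."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        scores.append(abs(rank_sum_z(a, b)))
    return sorted(range(len(scores)), key=lambda g: -scores[g])[:k]

def dct2(x):
    """Unnormalized DCT-II of a sequence."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * f / n) for t in range(n))
            for f in range(n)]

def propagate(F, y, sigma=1.0, alpha=0.9, iters=200):
    """Two-class label propagation; y[i] is 0, 1, or None (unlabeled)."""
    n = len(F)
    # Gaussian similarity graph with zero diagonal.
    W = [[0.0 if i == j else
          math.exp(-sum((p - q) ** 2 for p, q in zip(F[i], F[j])) / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    d = [sum(row) for row in W]
    # Symmetric normalization D^(-1/2) W D^(-1/2).
    S = [[W[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    Y = [[1.0 if y[i] == c else 0.0 for c in (0, 1)] for i in range(n)]
    Fm = [row[:] for row in Y]
    for _ in range(iters):
        Fm = [[alpha * sum(S[i][j] * Fm[j][c] for j in range(n)) + (1 - alpha) * Y[i][c]
               for c in (0, 1)] for i in range(n)]
    return [0 if r[0] >= r[1] else 1 for r in Fm]

# Toy data (made up): 6 samples x 4 genes; genes 0 and 2 separate the classes.
X = [
    [5.0, 1.0, 4.8, 0.3],
    [5.2, 0.2, 5.1, 0.9],
    [4.9, 0.7, 5.0, 0.1],
    [1.0, 0.9, 1.2, 0.5],
    [0.8, 0.3, 0.9, 0.8],
    [1.1, 0.6, 1.0, 0.2],
]
y = [0, 0, None, 1, 1, None]  # None marks an unlabeled sample

genes = select_genes(X, y, k=2)
features = [dct2([row[g] for g in genes]) for row in X]
pred = propagate(features, y)
print(genes, pred)
```

The unlabeled samples take part only in the graph, not in gene selection, which is the point of the semi-supervised setting. On real microarray data, `sigma` would need tuning so that no sample ends up with a zero total edge weight.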
Pages: 181-191
Page count: 11