Identification and Analysis of Cancer Diagnosis Using Probabilistic Classification Vector Machines with Feature Selection

Cited by: 21
Authors
Du, Xiuquan [1, 2]
Li, Xinrui [2]
Li, Wen [2]
Yan, Yuanting [1, 2]
Zhang, Yanping [1, 2]
Affiliations
[1] Anhui Univ, Key Lab Intelligent Comp & Signal Proc, Minist Educ, Hefei, Anhui, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Anhui, Peoples R China
Funding
National Science Foundation (USA)
Keywords
Probabilistic classification vector; feature selection; tumor classification; DX; machine learning; kernel function; GENE; PREDICTION
DOI
10.2174/1574893612666170405125637
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Accurate classification of tumor types is important for the treatment of cancer. With advances in microarray expression profiling, many methods have been proposed to handle these data. However, because the feature dimension of tumor gene expression profiles is very high, many machine learning algorithms fail. Objective & Methods: In this paper, a novel method, probabilistic classification vector machines (PCVM) with feature selection, is proposed for tumor type detection using gene expression data. PCVM adopts a signed and truncated Gaussian prior to address the unstable solutions that arise in relevance vector machines (RVMs) when the same prior is used for different classes; the truncated Gaussian prior also controls the complexity of the model. The performance of PCVM is evaluated on two datasets using four metrics. Results: The method achieves 84.21% accuracy on the leukemia dataset and 95.24% accuracy on the prostate dataset. PCVM performs much better than Support Vector Machines (SVM), Naive Bayes (NB), RBF Neural Networks (RBF), K-Nearest Neighbor (KNN), and Random Forest (RF), except for SVM on the prostate dataset. To reduce computational time, we adopt a feature selection method (DX) to rank the features and search for the optimal feature combination based on PCVM. PCVM with DX (PCVM-DX) achieves 94.74% accuracy, 100% sensitivity, 85.71% specificity and 92.31% precision on the leukemia dataset, and obtains the same result as PCVM on the prostate dataset. We also compare DX with other feature selection methods; the results show that PCVM-DX is efficient for tumor classification in terms of performance. Conclusion: PCVM-DX is observed to perform better than the other methods on the two datasets.
The novelty of this approach lies in applying PCVM to address the problem that using the same prior for different classes may lead to unstable solutions in RVMs, and in exploring the important feature subset in the microarray expression profile through feature selection.
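The four metrics named in the abstract (accuracy, sensitivity, specificity, precision) all follow from a binary confusion matrix. The sketch below shows the arithmetic. The counts are hypothetical: the abstract does not give the test-set size or the confusion matrix, so a 19-sample test set with 12 true positives, 0 false negatives, 6 true negatives and 1 false positive is assumed here only because it reproduces the PCVM-DX leukemia percentages. The function name `binary_metrics` is likewise illustrative.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity and precision as percentages."""
    total = tp + fn + tn + fp
    return {
        "accuracy": 100.0 * (tp + tn) / total,    # correct predictions / all samples
        "sensitivity": 100.0 * tp / (tp + fn),    # true positive rate (recall)
        "specificity": 100.0 * tn / (tn + fp),    # true negative rate
        "precision": 100.0 * tp / (tp + fp),      # positive predictive value
    }


# Hypothetical counts (an assumption, not taken from the paper): chosen so that
# the four percentages match those reported for PCVM-DX on the leukemia dataset.
leukemia_dx = binary_metrics(tp=12, fn=0, tn=6, fp=1)

for name, value in leukemia_dx.items():
    print(f"{name}: {value:.2f}%")
```

Under the same assumed test-set size, the 84.21% accuracy reported for PCVM without feature selection would correspond to 16 of 19 samples classified correctly.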
Pages: 625-632
Page count: 8