A centroid-based gene selection method for microarray data classification

Cited by: 23
Authors
Guo, Shun [1, 2]
Guo, Donghui [1 ]
Chen, Lifei [3 ]
Jiang, Qingshan [2 ]
Affiliations
[1] Xiamen Univ, Dept Elect Engn, Xiamen 361005, Fujian, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518000, Peoples R China
[3] Fujian Normal Univ, Sch Math & Comp Sci, Fuzhou 350117, Fujian, Peoples R China
Funding
Specialized Research Fund for the Doctoral Program of Higher Education; National Natural Science Foundation of China
Keywords
Class centroid; Microarray data; Classification; L1 regularization; Gene selection; DISCRIMINANT-ANALYSIS; ALGORITHMS; EFFICIENT
DOI
10.1016/j.jtbi.2016.03.034
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Microarray data used for classification typically contain a large number of irrelevant and redundant features. In this paper, a new gene selection method is proposed that chooses the best subset of features for microarray data by removing the irrelevant and redundant ones. We formulate the selection problem as an L1-regularized optimization problem based on a newly defined linear discriminant analysis criterion. Instead of calculating the mean of the samples, a kernel-based approach is used to estimate the class centroid, which in turn defines both the between-class separability and the within-class compactness of the criterion. Theoretical analysis indicates that the globally optimal solution of the L1-regularized criterion can be reached under a general condition. Building on this condition, an efficient algorithm is derived that solves the feature selection problem with time complexity linear in the number of features and the number of samples. Experimental results on ten publicly available microarray datasets demonstrate that the proposed method performs effectively and competitively compared with state-of-the-art methods. (C) 2016 Elsevier Ltd. All rights reserved.
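The abstract does not give the criterion or the algorithm in detail, so the following Python sketch is only a rough illustration of the general idea, not the authors' method. It assumes a Gaussian kernel weighting around the sample mean as the centroid estimate, a per-gene score equal to between-class separability minus within-class scatter, and a closed-form soft-threshold as the L1-regularized solution. The function names `kernel_centroid` and `select_genes` and the parameters `bandwidth` and `lam` are hypothetical.

```python
import numpy as np


def kernel_centroid(X, bandwidth=1.0):
    """Kernel-weighted centroid (assumed form, not taken from the paper).

    Samples close to the ordinary mean receive larger Gaussian weights,
    so outlying samples pull the centroid less than they pull the mean.
    """
    mean = X.mean(axis=0)
    d2 = ((X - mean) ** 2).sum(axis=1)
    # Rescale by the median squared distance so the weights do not
    # underflow when the number of genes is large.
    scale = np.median(d2) + 1e-12
    w = np.exp(-d2 / (2.0 * bandwidth ** 2 * scale))
    w /= w.sum()
    return w @ X


def select_genes(X, y, lam=0.1, bandwidth=1.0):
    """Return one non-negative weight per gene; zero means 'not selected'.

    Assumed objective: maximize  w.s - lam*||w||_1 - 0.5*||w||^2  with w >= 0,
    where s is the per-gene (between - within) score. Its maximizer is the
    soft-threshold max(s - lam, 0), computed in O(samples * genes) time.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall = kernel_centroid(X, bandwidth)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        mc = kernel_centroid(Xc, bandwidth)
        between += len(Xc) * (mc - overall) ** 2   # separability per gene
        within += ((Xc - mc) ** 2).sum(axis=0)     # compactness per gene
    score = (between - within) / len(y)
    return np.maximum(score - lam, 0.0)


if __name__ == "__main__":
    # Synthetic data: 40 samples, 20 genes, only genes 0 and 1 informative.
    rng = np.random.default_rng(0)
    y = np.repeat([0, 1], 20)
    X = rng.normal(size=(40, 20))
    X[y == 1, :2] += 3.0
    weights = select_genes(X, y)
    print("selected genes:", np.flatnonzero(weights))
```

Under these assumptions, a larger `lam` drives more gene weights to exactly zero, which is the usual effect of an L1 penalty; the paper's actual criterion and optimality condition may differ.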
Pages: 32-41
Page count: 10