An efficient gene selection technique for cancer recognition based on neighborhood mutual information

Cited by: 70
Authors
Hu, Qinghua [1]
Pan, Wei [1]
An, Shuang [1]
Ma, Peijun [1]
Wei, Jinmao [2]
Affiliations
[1] Harbin Inst Technol, Harbin 150006, Peoples R China
[2] Nankai Univ, Tianjin 300071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cancer recognition; Gene selection; Neighborhood mutual information; Maximal relevancy; Minimal redundancy; SUBSET-SELECTION; CLASSIFICATION; PREDICTION; IDENTIFICATION; MICROARRAY; RELEVANCE; SCHEME;
DOI
10.1007/s13042-010-0008-6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Gene selection is a key problem in gene-expression-based cancer recognition and related tasks. This work introduces a measure, called neighborhood mutual information (NMI), to evaluate the relevance between genes and the related decision. The measure is then combined with the minimal-redundancy, maximal-relevancy (mRMR) search strategy to construct an NMI-based mRMR gene selection algorithm (NMI_mRMR). In addition, it is found that the first k best genes with respect to NMI are usually enough for cancer classification, so mRMR can be run on these genes alone and the rest removed in a preprocessing step, which reduces computational time. Based on this observation, an efficient gene selection algorithm, denoted NMI_EmRMR, is proposed. Several cancer recognition tasks are gathered to test the proposed technique. The experimental results show that NMI_EmRMR is effective and efficient.
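The two-stage idea in the abstract (rank genes by NMI, keep the first k, then run mRMR on the survivors) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. The abstract does not give the NMI formula, so the code assumes the neighborhood-count form NMI(R;S) = -(1/n) Σ log2(|δ_R(x_i)|·|δ_S(x_i)| / (n·|δ_{R∪S}(x_i)|)). It also assumes a neighborhood radius `delta = 0.15` on expression values scaled to [0, 1], and the difference form of the mRMR criterion (relevance minus mean redundancy). The function names `neighbors`, `nmi`, and `nmi_emrmr` and the toy data are hypothetical.

```python
import math

def neighbors(values, delta):
    """For each sample i, the set of samples within delta of it.
    With delta=0 and integer class labels this is the set sharing i's label."""
    n = len(values)
    return [{j for j in range(n) if abs(values[i] - values[j]) <= delta}
            for i in range(n)]

def nmi(nb_a, nb_b):
    """Neighborhood mutual information from two per-sample neighbor lists
    (assumed count-based form; see the lead-in)."""
    n = len(nb_a)
    total = 0.0
    for a, b in zip(nb_a, nb_b):
        joint = a & b  # neighbors under both variables; always contains the sample itself
        total += math.log2(len(a) * len(b) / (n * len(joint)))
    return -total / n

def nmi_emrmr(genes, labels, k, m, delta=0.15):
    """Keep the k genes most relevant to the labels, then greedily pick m of
    them by relevance minus mean redundancy with the genes already chosen."""
    nb_y = neighbors(labels, 0)
    nb = [neighbors(g, delta) for g in genes]
    relevance = [nmi(g_nb, nb_y) for g_nb in nb]

    # Preprocessing step: discard everything outside the top-k by NMI.
    candidates = sorted(range(len(genes)),
                        key=lambda g: relevance[g], reverse=True)[:k]

    selected = []
    while candidates and len(selected) < m:
        def score(g):
            if not selected:
                return relevance[g]
            redundancy = sum(nmi(nb[g], nb[s]) for s in selected) / len(selected)
            return relevance[g] - redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy data (invented): 6 samples, 3 classes, 4 genes (one list per gene).
labels = [0, 0, 1, 1, 2, 2]
genes = [
    [0.10, 0.10, 0.90, 0.90, 0.90, 0.90],  # gene 0: separates class 0 from the rest
    [0.12, 0.08, 0.88, 0.92, 0.90, 0.91],  # gene 1: near-duplicate of gene 0
    [0.50, 0.50, 0.50, 0.50, 0.50, 0.50],  # gene 2: constant, carries no information
    [0.10, 0.10, 0.10, 0.10, 0.90, 0.90],  # gene 3: separates class 2 from the rest
]
selected = nmi_emrmr(genes, labels, k=3, m=2)
print(selected)
```

On this toy data the constant gene is dropped by the top-k filter, and the redundancy term keeps the two near-duplicate genes from both being chosen. The pairwise neighbor search is quadratic in the number of samples, which is tolerable for microarray data with few samples and many genes; the top-k filter limits the number of gene-to-gene NMI evaluations.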
Pages: 63-74
Number of pages: 12