Ensemble constrained Laplacian score for efficient and robust semi-supervised feature selection

Cited by: 14
Authors
Benabdeslem, Khalid [1]
Elghazel, Haytham [1]
Hindawi, Mohammed [2]
Affiliations
[1] Univ Lyon 1, LIRIS, 43 Bd 11 Novembre 1918, F-69622 Villeurbanne, France
[2] Zirve Univ, Dept Comp Sci, Kizilhisar Campus, TR-27260 Gaziantep, Turkey
Keywords
Feature selection; Semi-supervised context; Ensemble methods; Constraints; CLUSTERING ENSEMBLES; CONSENSUS; CANCER
DOI
10.1007/s10115-015-0901-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose an efficient and robust approach for semi-supervised feature selection based on the constrained Laplacian score. The main drawback of this method is its dependence on the choice of the scant supervision information, represented by pairwise constraints. In fact, constraints have been shown to contain noise, which may degrade learning performance. In this work, we try to counteract the negative effects of the constraint set by varying its sources. This is achieved by an ensemble technique that uses both resampling of the data (bagging) and a random subspace strategy. Experiments on high-dimensional datasets are provided to validate the proposed approach and to compare it with other representative feature selection methods.
Pages: 1161-1185
Page count: 25
Related papers
41 items in total
  • [1] Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling
    Alizadeh, AA
    Eisen, MB
    Davis, RE
    Ma, C
    Lossos, IS
    Rosenwald, A
    Boldrick, JG
    Sabet, H
    Tran, T
    Yu, X
    Powell, JI
    Yang, LM
    Marti, GE
    Moore, T
    Hudson, J
    Lu, LS
    Lewis, DB
    Tibshirani, R
    Sherlock, G
    Chan, WC
    Greiner, TC
    Weisenburger, DD
    Armitage, JO
    Warnke, R
    Levy, R
    Wilson, W
    Grever, MR
    Byrd, JC
    Botstein, D
    Brown, PO
    Staudt, LM
    [J]. NATURE, 2000, 403 (6769) : 503 - 511
  • [2] [Anonymous], TR10007
  • [3] [Anonymous], 2006, Series Studies in Fuzziness and Soft Computing
  • [4] Barkia H., 2011, Proceedings of the 2011 IEEE 11th International Conference on Data Mining (ICDM 2011), P31, DOI 10.1109/ICDM.2011.129
  • [5] Efficient Semi-Supervised Feature Selection: Constraint, Relevance, and Redundancy
    Benabdeslem, Khalid
    Hindawi, Mohammed
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (05) : 1131 - 1143
  • [6] Benabdeslem K, 2011, LECT NOTES ARTIF INT, V6911, P204, DOI 10.1007/978-3-642-23780-5_23
  • [7] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [8] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [9] LIBSVM: A Library for Support Vector Machines
    Chang, Chih-Chung
    Lin, Chih-Jen
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
  • [10] CORMEN TH, 2001, INTRO ALGORITHMS