Bagging Constraint Score for feature selection with pairwise constraints

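Cited by: 46
Authors
Sun, Dan [1]
Zhang, Daoqiang [1]
Affiliations
[1] Nanjing University of Aeronautics and Astronautics, Department of Computer Science and Engineering, Nanjing 210016, People's Republic of China
Funding
U.S. National Science Foundation;
Keywords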
Feature selection; Constraint Score; Pairwise constraints; Bagging; Ensemble learning; IMAGE RETRIEVAL; RELEVANCE; CLASSIFICATION; PREDICTION; FRAMEWORK; SVM;
DOI
10.1016/j.patcog.2009.12.011
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Constraint Score is a recently proposed feature selection method that uses pairwise constraints, which specify whether a pair of instances belongs to the same class or not. It has been shown that Constraint Score, with only a small number of pairwise constraints, achieves performance comparable to that of fully supervised feature selection methods such as Fisher Score. However, one major disadvantage of Constraint Score is that its performance depends on a good choice of the composition and cardinality of the constraint set, which is very challenging in practice. In this work, we address the problem by incorporating Bagging into Constraint Score, and propose a new method called Bagging Constraint Score (BCS). Instead of seeking one appropriate constraint set for a single Constraint Score, BCS runs multiple Constraint Scores, each of which uses a bootstrapped subset of the originally given constraint set. Diversity analysis of the ensemble's individual members shows that resampling pairwise constraints helps to improve both the accuracy and the diversity of the individuals. We conduct extensive experiments on a series of high-dimensional datasets from the UCI repository and gene databases, and the experimental results validate the effectiveness of the proposed method. © 2009 Elsevier Ltd. All rights reserved.
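The resampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on two assumptions that the abstract does not state. First, the per-feature score is taken to be the ratio of squared feature differences over must-link pairs to those over cannot-link pairs, with lower values meaning more relevant features. Second, the bootstrap rounds are combined by averaging feature ranks, which is chosen here only for illustration. The function names `constraint_score` and `bagging_constraint_ranks` and the parameters `n_rounds`, `seed`, and `eps` are hypothetical.

```python
import numpy as np

def constraint_score(X, must_link, cannot_link, eps=1e-12):
    """Per-feature score in an assumed ratio form; lower means more relevant."""
    ml = np.asarray(must_link)
    cl = np.asarray(cannot_link)
    # Sum of squared per-feature differences over each set of constrained pairs.
    ml_spread = ((X[ml[:, 0]] - X[ml[:, 1]]) ** 2).sum(axis=0)
    cl_spread = ((X[cl[:, 0]] - X[cl[:, 1]]) ** 2).sum(axis=0)
    # eps guards against division by zero when cannot-link pairs coincide.
    return ml_spread / (cl_spread + eps)

def bagging_constraint_ranks(X, must_link, cannot_link, n_rounds=20, seed=0):
    """Mean feature rank over bootstrapped constraint sets (assumed aggregation)."""
    rng = np.random.default_rng(seed)
    ml = np.asarray(must_link)
    cl = np.asarray(cannot_link)
    rank_sum = np.zeros(X.shape[1])
    for _ in range(n_rounds):
        # Resample each constraint set with replacement, keeping its size.
        ml_boot = ml[rng.integers(0, len(ml), size=len(ml))]
        cl_boot = cl[rng.integers(0, len(cl), size=len(cl))]
        scores = constraint_score(X, ml_boot, cl_boot)
        # Double argsort turns scores into ranks (0 = best feature).
        rank_sum += scores.argsort().argsort()
    return rank_sum / n_rounds
```

Each row of `must_link` and `cannot_link` is a pair of row indices into `X`. Features with the smallest mean rank would be selected first under these assumptions.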
Pages: 2106-2118
Number of pages: 13
Related papers
51 in total
  • [1] ALMUALLIM H, 1991, PROCEEDINGS : NINTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS 1 AND 2, P547
  • [2] Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays
    Alon, U
    Barkai, N
    Notterman, DA
    Gish, K
    Ybarra, S
    Mack, D
    Levine, AJ
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1999, 96 (12) : 6745 - 6750
  • [3] Predicting protein structural class by SVM with class-wise optimized features and decision probabilities
    Anand, Ashish
    Pugalenthi, Ganesan
    Suganthan, P. N.
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 2008, 253 (02) : 375 - 380
  • [4] [Anonymous], 2006, Computer Science, University of Wisconsin-Madison
  • [5] [Anonymous], P 10 EUR C PRINC PRA
  • [6] [Anonymous], 2008, P 23 AAAI C ART INT
  • [7] Bar-Hillel AB, 2005, J MACH LEARN RES, V6, P937
  • [8] Basu S, 2004, SIAM PROC S, P333
  • [9] Blake C. L., 1998, Uci repository of machine learning databases
  • [10] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32