Integration of dense subgraph finding with feature clustering for unsupervised feature selection

Cited by: 48
Authors
Bandyopadhyay, Sanghamitra [1]
Bhadra, Tapas [1]
Mitra, Pabitra [2]
Maulik, Ujjwal [3]
Affiliations
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700108, India
[2] Indian Inst Technol, Dept Comp Sci & Engn, Kharagpur 721302, W Bengal, India
[3] Jadavpur Univ, Dept Comp Sci & Engn, Kolkata 700032, India
Keywords
Pattern recognition; Unsupervised feature selection; Mutual information; Normalized mutual information; MACHINE
DOI
10.1016/j.patrec.2013.12.008
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, a dense-subgraph-finding approach is adopted for the unsupervised feature selection problem. The feature set of a dataset is mapped to a graph representation in which individual features constitute the vertex set and inter-feature mutual information gives the edge weights. Feature selection is performed in two phases. In the first phase, the densest subgraph is obtained so that the features are maximally non-redundant with each other. In the second phase, feature clustering around these non-redundant features is performed to produce the reduced feature set. An approximation algorithm is used to find the densest subgraph. Empirically, the proposed approach is found to be competitive with several state-of-the-art unsupervised feature selection algorithms. © 2013 Elsevier B.V. All rights reserved.
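The two phases described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not name the approximation algorithm, so the sketch uses the well-known greedy peeling heuristic (repeatedly remove the vertex of smallest weighted degree and keep the densest intermediate subgraph); the edge weight is taken as 1 minus the normalized mutual information so that a dense subgraph corresponds to mutually non-redundant features; the normalization by the smaller entropy is one common choice; and `cluster_around` is a hypothetical stand-in for the clustering phase. All function names are illustrative.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in nats) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def normalized_mi(xs, ys):
    """I(X;Y) / min(H(X), H(Y)); this normalization is an assumption."""
    hx, hy = entropy(xs), entropy(ys)
    mi = hx + hy - entropy(list(zip(xs, ys)))
    denom = min(hx, hy)
    return mi / denom if denom > 0 else 0.0


def dissimilarity_graph(columns):
    """Assumed weighting: w[i][j] = 1 - NMI(feature i, feature j)."""
    n = len(columns)
    return [[0.0 if i == j else 1.0 - normalized_mi(columns[i], columns[j])
             for j in range(n)] for i in range(n)]


def densest_subgraph(weights):
    """Greedy peeling; density = total edge weight / number of vertices."""
    n = len(weights)
    alive = set(range(n))
    degree = [sum(weights[i][j] for j in range(n) if j != i) for i in range(n)]
    total = sum(degree) / 2.0  # each edge is counted twice in the degrees
    best_set, best_density = set(alive), total / n
    while len(alive) > 1:
        # Remove the vertex with the smallest weighted degree (ties: lowest index).
        v = min(alive, key=lambda i: (degree[i], i))
        alive.remove(v)
        total -= degree[v]
        for u in alive:
            degree[u] -= weights[u][v]
        density = total / len(alive)
        if density > best_density:
            best_set, best_density = set(alive), density
    return best_set, best_density


def cluster_around(weights, seeds):
    """Hypothetical phase 2: attach each remaining feature to its most similar seed."""
    clusters = {s: [] for s in seeds}
    for f in range(len(weights)):
        if f not in clusters:
            nearest = min(sorted(seeds), key=lambda s: weights[f][s])
            clusters[nearest].append(f)
    return clusters
```

A typical call would build the graph from discretized feature columns with `dissimilarity_graph`, pass it to `densest_subgraph` to obtain the candidate non-redundant features, and then group the remaining features with `cluster_around`. How the final reduced set is read off the clusters is not specified in the abstract and is left open here.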
Pages: 104-112
Number of pages: 9