Sparse Graph Embedding Unsupervised Feature Selection

Cited by: 90
Authors
Wang, Shiping [1 ]
Zhu, William [2 ]
Affiliations
[1] Fuzhou Univ, Fujian Prov Key Lab Network Comp & Intelligent In, Fuzhou 350116, Fujian, Peoples R China
[2] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu 610054, Sichuan, Peoples R China
Source
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS | 2018, Vol. 48, No. 3
Funding
National Natural Science Foundation of China;
Keywords
Feature selection; machine learning; nonnegative matrix factorization; sparse coding; unsupervised learning; MUTUAL INFORMATION; RELEVANCE; RANKING;
DOI
10.1109/TSMC.2016.2605132
Chinese Library Classification (CLC) Number
TP [Automation technology; computer technology];
Subject Classification Code
0812 ;
Abstract
High dimensionality is commonly encountered in data mining problems, so dimensionality reduction becomes an important task for improving the efficiency of learning algorithms. As a widely used dimensionality reduction technique, feature selection chooses a feature subset guided by a certain criterion. In this paper, three unsupervised feature selection algorithms are proposed and addressed from the viewpoint of sparse graph embedding learning. First, exploiting the self-characterization of the given data, we view the data themselves as a dictionary, conduct sparse coding, and propose the sparsity preserving feature selection (SPFS) algorithm. Second, considering the locality preservation of neighborhoods in the data, we study a special case of the SPFS problem, namely the neighborhood preserving feature selection problem, and devise a suitable algorithm for it. Third, we incorporate sparse coding and feature selection into one unified framework and propose a neighborhood embedding feature selection (NEFS) criterion. Drawing support from nonnegative matrix factorization, the corresponding algorithm for NEFS is presented and its convergence is proved. Finally, the three proposed algorithms are validated on eight publicly available real-world datasets from a machine learning repository. Extensive experimental results demonstrate the superiority of the proposed algorithms over four state-of-the-art unsupervised feature selection methods.
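The abstract's first idea (SPFS) can be sketched concretely: learn a sparse self-representation of the samples using the data as its own dictionary, then rank features by how well they preserve that representation. The Lasso solver, the residual-based scoring rule, and all parameter values below are illustrative assumptions, not the authors' exact formulation:

```python
# Hedged sketch of a sparsity-preserving feature selection step.
# Assumed recipe: (1) learn W with X ~ W @ X, each sample coded sparsely
# over the other samples; (2) score each feature by its reconstruction
# residual under W (lower = structure better preserved by that feature).
import numpy as np
from sklearn.linear_model import Lasso

def sparse_self_representation(X, alpha=0.05):
    """Learn a row-sparse W with zero diagonal so that X ~ W @ X."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        mask = np.arange(n) != i           # exclude sample i from its own dictionary
        model = Lasso(alpha=alpha, max_iter=5000)
        model.fit(X[mask].T, X[i])         # sparse-code x_i over the remaining samples
        W[i, mask] = model.coef_
    return W

def spfs_scores(X, alpha=0.05):
    """One residual score per feature; smaller means better preserved."""
    W = sparse_self_representation(X, alpha)
    residual = X - W @ X                   # per-entry sparse-reconstruction error
    return np.sum(residual ** 2, axis=0)   # aggregate error per feature column

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 8))           # 30 samples, 8 features
scores = spfs_scores(X)
selected = np.argsort(scores)[:3]          # keep the 3 best-preserved features
```

This is only one plausible instantiation of "data as dictionary" sparse coding; the paper's actual objective couples the coding and selection steps (and, for NEFS, solves them jointly via NMF-style multiplicative updates).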
Pages: 329-341
Number of pages: 13