Unsupervised feature selection via joint local learning and group sparse regression

Cited by: 0
Authors
Yue WU [1,2]
Can WANG [1,2]
Yueqing ZHANG [1]
Jiajun BU [1,2]
Affiliations
[1] Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science and Technology, Zhejiang University
[2] Alibaba-Zhejiang University Joint Institute of Frontier Technologies
Keywords
Unsupervised; Local learning; Group sparse regression; Feature selection;
DOI
Not available
CLC number
TP181 [Automated reasoning and machine learning];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection has attracted a great deal of interest over the past decades. By selecting meaningful feature subsets, the performance of learning algorithms can be effectively improved. Because label information is expensive to obtain, unsupervised feature selection methods are more widely used than supervised ones. The key to unsupervised feature selection is to find features that effectively reflect the underlying data distribution. However, due to the inevitable redundancy and noise in a dataset, the intrinsic data distribution is not best revealed when all features are used. To address this issue, we propose a novel unsupervised feature selection algorithm via joint local learning and group sparse regression (JLLGSR). JLLGSR incorporates local-learning-based clustering with group-sparsity-regularized regression in a single formulation, and seeks features that respect both the manifold structure and the group sparse structure in the data space. An iterative optimization method is developed in which the weights converge on the important features, and the selected features are able to improve the clustering results. Experiments on multiple real-world datasets (images, voices, and web pages) demonstrate the effectiveness of JLLGSR.
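To make the group-sparse-regression idea in the abstract concrete, the sketch below scores features by solving an ℓ2,1-norm regularized regression, min_W ||XW − Y||_F² + λ||W||_{2,1}, with the standard iteratively reweighted least-squares update; rows of W with large ℓ2 norms mark informative features. This is a generic illustration of group sparse regression for feature scoring, not the authors' JLLGSR algorithm: the pseudo-label matrix Y, the function name, and the parameter values are all assumptions for the example, and JLLGSR's local-learning clustering term is omitted.

```python
import numpy as np

def l21_feature_scores(X, Y, lam=1.0, n_iter=50):
    """Score features via min_W ||XW - Y||_F^2 + lam * ||W||_{2,1}.

    Uses the standard iteratively reweighted least-squares scheme:
    each iteration solves a ridge-like system with a diagonal
    reweighting matrix D built from the current row norms of W.
    (Illustrative sketch only -- not the JLLGSR formulation.)
    """
    d = X.shape[1]
    D = np.eye(d)  # diagonal reweighting matrix, initialized to identity
    for _ in range(n_iter):
        # closed-form update: W = (X^T X + lam * D)^{-1} X^T Y
        W = np.linalg.solve(X.T @ X + lam * D, X.T @ Y)
        row_norms = np.sqrt((W ** 2).sum(axis=1)) + 1e-12  # avoid div by 0
        D = np.diag(1.0 / (2.0 * row_norms))
    # per-feature importance = l2 norm of the corresponding row of W
    return np.sqrt((W ** 2).sum(axis=1))

# Toy usage: 2 informative features out of 6, with hypothetical
# one-hot pseudo cluster labels standing in for unsupervised targets.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
labels = (X[:, 0] + X[:, 1] > 0).astype(int)  # depends only on features 0, 1
Y = np.eye(2)[labels]                         # one-hot indicator matrix
scores = l21_feature_scores(X, Y, lam=1.0)
top2 = set(np.argsort(scores)[-2:])
```

In unsupervised settings such as JLLGSR's, the target matrix Y would itself be learned (e.g., from local clustering) rather than fixed in advance as in this toy example.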
Pages: 538-553
Number of pages: 16