Unsupervised and Supervised Feature Selection for Incomplete Data via L2,1-Norm and Reconstruction Error Minimization

Cited by: 1
Authors
Cai, Jun [1]
Fan, Linge [1]
Xu, Xin [1]
Wu, Xinrong [1]
Affiliations
[1] Army Engn Univ PLA, Coll Commun Engn, Nanjing 210007, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2022 / Volume 12 / Issue 17
Funding
National Natural Science Foundation of China;
Keywords
unsupervised feature selection; supervised feature selection; incomplete data; L2,1-norm; reconstruction error; MISSING DATA; IMPUTATION;
DOI
10.3390/app12178752
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Feature selection has been widely used in machine learning and data mining since it can alleviate the burden of the so-called curse of dimensionality of high-dimensional data. However, in previous works, researchers have designed feature selection methods with the assumption that all the information from a data set can be observed. In this paper, we propose unsupervised and supervised feature selection methods for use with incomplete data, further introducing an L2,1-norm and a reconstruction error minimization method. Specifically, the proposed feature selection objective functions take advantage of an indicator matrix reflecting unobserved information in incomplete data sets, and we present pairwise constraints, minimizing the L2,1-norm robust loss function and performing error reconstruction simultaneously. Furthermore, we derive two alternative iterative algorithms to effectively optimize the proposed objective functions, and the convergence of the proposed algorithms is proven theoretically. Extensive experimental studies were performed on both real and synthetic incomplete data sets to demonstrate the performance of the proposed methods.
Pages: 21