Iterative constraint score based on hypothesis margin for semi-supervised feature selection

Cited by: 4
Authors
Chen, Xinyi
Zhang, Li [1 ]
Zhao, Lei
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
Keywords
Semi-supervised learning; Feature selection; Constraint score; Hypothesis margin; CLASSIFICATION; PREDICTION; RELEVANCE;
DOI
10.1016/j.knosys.2023.110577
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To remove redundant features and avoid the curse of dimensionality, the most important features should be selected for downstream tasks, including semi-supervised learning. Several semi-supervised constraint scores using pairwise constraints have been proposed to estimate the relevance of features. However, these methods evaluate the features individually and ignore the correlations between them. Thus, we propose a semi-supervised feature selection method called the iterative constraint score based on the hypothesis margin (HM-ICS), which uses forward sequential selection to select an optimal feature subset with a good ability to maintain the constraint structure of the data and distinguish samples that belong to different classes. HM-ICS iteratively modifies the classical constraint score method to measure the relevance between features and maintain the constraint structure of the data. By introducing the hypothesis margin, HM-ICS can ensure strong discriminative power of the optimal feature subset. Extensive experiments were conducted on nine UCI and five high-dimensional datasets, and the experimental results confirmed that HM-ICS can achieve better performance than state-of-the-art supervised and semi-supervised methods. (c) 2023 Elsevier B.V. All rights reserved.
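The abstract describes forward sequential selection guided by a pairwise-constraint criterion. As a rough illustration only (this is not the authors' HM-ICS; the scoring function and all names below are simplified assumptions), such a greedy search over feature subsets might be sketched as:

```python
# Hypothetical sketch: forward sequential feature selection driven by a
# simple pairwise-constraint score. NOT the paper's HM-ICS method; the
# score here is a basic must-link/cannot-link distance ratio.
import numpy as np

def constraint_score(X, must_link, cannot_link):
    """Score a feature subset: must-link pairs should be close,
    cannot-link pairs far apart (lower score = better subset)."""
    ml = sum(np.sum((X[i] - X[j]) ** 2) for i, j in must_link)
    cl = sum(np.sum((X[i] - X[j]) ** 2) for i, j in cannot_link)
    return ml / (cl + 1e-12)  # small epsilon guards against division by zero

def forward_selection(X, must_link, cannot_link, k):
    """Greedily grow a feature subset of size k, at each step adding the
    feature whose inclusion yields the lowest constraint score."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k and remaining:
        best = min(
            remaining,
            key=lambda f: constraint_score(
                X[:, selected + [f]], must_link, cannot_link
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because features are scored jointly with those already chosen, this kind of search can capture correlations between features that per-feature constraint scores miss, which is the motivation the abstract gives for an iterative scheme.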
Pages: 15
References
40 items in total
[1] Alelyani S, 2014, CH CRC DATA MIN KNOW, P29.
[2] Asuncion A, 2007, UCI Machine Learning Repository.
[3] Benabdeslem K, Elghazel H, Hindawi M. Ensemble constrained Laplacian score for efficient and robust semi-supervised feature selection. Knowledge and Information Systems, 2016, 49(3): 1161-1185.
[4] Benabdeslem K, 2011, LECT NOTES ARTIF INT, V6911, P204. DOI: 10.1007/978-3-642-23780-5_23.
[5] Bhattacharjee A, Richards WG, Staunton J, et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proceedings of the National Academy of Sciences of the United States of America, 2001, 98(24): 13790-13795.
[6] Bishop CM, 1995, Neural Networks for Pattern Recognition.
[7] Cai J, Luo J, Wang S, Yang S. Feature selection in machine learning: A new perspective. Neurocomputing, 2018, 300: 70-79.
[8] Chandrashekar G, Sahin F. A survey on feature selection methods. Computers & Electrical Engineering, 2014, 40(1): 16-28.
[9] Chen H, Tino P, Yao X. Predictive Ensemble Pruning by Expectation Propagation. IEEE Transactions on Knowledge and Data Engineering, 2009, 21(7): 999-1013.
[10] Chung FR, 1997, Spectral Graph Theory, V92.