A new feature selection using dynamic interaction
Cited: 5
Authors:
Li, Zhang
[1,2]
Affiliations:
[1] Jiangsu Univ Technol, Sch Comp Engn, Changzhou 213001, Jiangsu, Peoples R China
[2] Beijing Univ Posts & Telecommun, Minist Educ, Key Lab Trustworthy Distributed Comp & Serv, Beijing 100876, Peoples R China
Keywords:
Feature selection;
Feature interaction;
Feature relevance;
Feature redundancy;
Filter method;
MUTUAL INFORMATION;
RELEVANCE;
CLASSIFICATION;
DOI:
10.1007/s10044-020-00916-2
CLC number:
TP18 [Artificial Intelligence Theory];
Discipline codes:
081104; 0812; 0835; 1405
Abstract:
With the continuous development of Internet technology, data are becoming increasingly complex and high-dimensional. Such high-dimensional data contain large numbers of redundant and irrelevant features, which pose great challenges to existing machine learning algorithms. Feature selection is an important research topic in machine learning, pattern recognition and data mining, and a key step in data preprocessing. Its goal is to find, within the original feature set, an optimal feature subset that improves classification accuracy and reduces learning time. Traditional feature selection algorithms tend to ignore features that have weak discriminating power individually but strong discriminating power when combined with others. Therefore, this paper proposes a new dynamic interaction feature selection (DIFS) algorithm. First, within the theoretical framework of interaction information, it redefines feature relevance, irrelevance and redundancy. Second, it derives formulas for computing interaction information. Finally, on eleven UCI data sets and with three different classifiers, namely KNN, SVM and C4.5, the DIFS algorithm improves classification accuracy over the full feature set by 3.2848% and reduces the number of selected features by 15.137 on average. Hence, the DIFS algorithm not only identifies relevant features effectively, but also identifies irrelevant and redundant features. Moreover, it effectively improves classification accuracy and reduces the feature dimensionality of the data sets.
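The abstract builds on interaction information, the standard three-way generalization of mutual information. As a rough illustration (a minimal sketch of the underlying quantity, not the paper's DIFS formulas, whose exact definitions are in the article), interaction information can be estimated for discrete features as I(X;Y;C) = I(X,Y;C) − I(X;C) − I(Y;C); a positive value signals the synergy the abstract describes, where two features are individually weak but jointly informative about the class:

```python
import numpy as np
from collections import Counter

def entropy(*cols):
    """Shannon entropy (bits) of the joint distribution of discrete columns."""
    joint = list(zip(*cols))
    probs = np.array([c / len(joint) for c in Counter(joint).values()])
    return float(-np.sum(probs * np.log2(probs)))

def mutual_info(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def interaction_info(x, y, c):
    """I(X;Y;C) = I(X,Y;C) - I(X;C) - I(Y;C).
    Positive: x and y together carry more information about the
    class c than the sum of their individual contributions."""
    i_xy_c = entropy(x, y) + entropy(c) - entropy(x, y, c)
    return i_xy_c - mutual_info(x, c) - mutual_info(y, c)

# XOR example: each feature alone is useless, together they determine c.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
c = [0, 1, 1, 0]          # c = x XOR y
print(mutual_info(x, c))          # 0.0  (x alone is irrelevant)
print(interaction_info(x, y, c))  # 1.0  (strong positive interaction)
```

The XOR case is exactly the kind of feature group a purely individual relevance filter would discard, which is the gap the DIFS algorithm targets.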
Pages: 203-215
Page count: 13