Feature redundancy term variation for mutual information-based feature selection

Cited by: 46
Authors
Gao, Wanfu [1,2,3]
Hu, Liang [1,2]
Zhang, Ping [1,2]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
[2] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun, Peoples R China
[3] Jilin Univ, Coll Chem, Changchun, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Machine learning; Feature selection; Information theory; Feature redundancy; Relevance
DOI
10.1007/s10489-019-01597-z
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Feature selection plays a critical role in many applications in machine learning, image processing, and gene expression analysis. Traditional feature selection methods aim to maximize feature dependency while minimizing feature redundancy. In previous information-theoretic feature selection methods, the feature redundancy term is measured either by the mutual information between a candidate feature and each already-selected feature, or by the interaction information among a candidate feature, each already-selected feature, and the class. However, a larger value of the traditional feature redundancy term does not necessarily indicate a worse candidate feature, because a feature that carries much redundant information can simultaneously offer substantial new classification information. To address this issue, we design a new feature redundancy term that considers the relevance between a candidate feature and the class given each already-selected feature, and propose a novel feature selection method named min-redundancy and max-dependency (MRMD). To verify its effectiveness, MRMD is compared with eight competitive methods on an artificial example and on fifteen real-world data sets. The experimental results show that our method achieves the best classification performance with respect to multiple evaluation criteria.
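The contrast the abstract draws can be made concrete with a small sketch. The greedy selector below is an illustrative reading of the abstract only, not the paper's exact MRMD criterion: the usual redundancy penalty I(f; s) against each already-selected feature s is offset by the conditional relevance I(f; Y | s), the new information a candidate feature f still carries about the class Y once s is known. All function names (mutual_info, cond_mutual_info, select_features) are hypothetical helpers introduced here, and the estimators assume discrete-valued features.

```python
# Illustrative sketch only: greedy forward selection where the redundancy
# penalty I(f; s) is offset by the conditional relevance I(f; Y | s),
# in the spirit of (not necessarily identical to) the MRMD criterion.
import numpy as np

def mutual_info(x, y):
    """Discrete mutual information I(X; Y) from empirical joint counts."""
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def cond_mutual_info(x, y, z):
    """Conditional mutual information I(X; Y | Z) = sum_z P(z) I(X; Y | Z=z)."""
    total = 0.0
    for zv in np.unique(z):
        mask = (z == zv)
        total += mask.mean() * mutual_info(x[mask], y[mask])
    return total

def select_features(X, y, k):
    """Greedily pick k columns of a discrete feature matrix X.
    Assumed score: I(f; y) minus the mean over selected s of
    [I(f; s) - I(f; y | s)], so redundancy is penalized only to the
    extent that it is not offset by new class information."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k and remaining:
        def score(f):
            rel = mutual_info(X[:, f], y)
            if not selected:
                return rel
            penalty = np.mean([mutual_info(X[:, f], X[:, s])
                               - cond_mutual_info(X[:, f], y, X[:, s])
                               for s in selected])
            return rel - penalty
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

For example, select_features(X, y, k=5) on a discretized feature matrix X returns the indices of five features; under this score, a feature that is redundant with s but still adds class information given s is penalized less than one that is merely redundant.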
Pages: 1272-1288
Number of pages: 17
Related Papers
31 records in total
[1] Battiti, R. Using mutual information for selecting features in supervised neural-net learning. IEEE Transactions on Neural Networks, 1994, 5(4): 537-550.
[2] Bennasar, Mohamed; Hicks, Yulia; Setchi, Rossitza. Feature selection using Joint Mutual Information Maximisation. Expert Systems with Applications, 2015, 42(22): 8520-8532.
[3] Bennasar, Mohamed; Setchi, Rossitza; Hicks, Yulia. Feature Interaction Maximisation. Pattern Recognition Letters, 2013, 34(14): 1630-1635.
[4] Bolon-Canedo, V.; Sanchez-Marono, N.; Alonso-Betanzos, A.; Benitez, J. M.; Herrera, F. A review of microarray datasets and applied feature selection methods. Information Sciences, 2014, 282: 111-135.
[5] Chen, Renjie; Sun, Ning; Chen, Xiaojun; Yang, Min; Wu, Qingyao. Supervised Feature Selection With a Stratified Feature Weighting Method. IEEE Access, 2018, 6: 15087-15098.
[6] Cover, T. M.; Thomas, J. A. Elements of Information Theory. Wiley Series in Telecommunications. Wiley, 1991. DOI: 10.1002/047174882X.
[7] Gao, Wanfu; Hu, Liang; Zhang, Ping; He, Jialong. Feature selection considering the composition of feature relevancy. Pattern Recognition Letters, 2018, 112: 70-74.
[8] Gao, Wanfu; Hu, Liang; Zhang, Ping. Class-specific mutual information variation for feature selection. Pattern Recognition, 2018, 79: 328-339.
[9] Gui, Jie; Sun, Zhenan; Ji, Shuiwang; Tao, Dacheng; Tan, Tieniu. Feature Selection Based on Structured Sparsity: A Comprehensive Study. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(7): 1490-1507.
[10] Hancer, Emrah; Xue, Bing; Zhang, Mengjie; Karaboga, Dervis; Akay, Bahriye. Pareto front feature selection based on artificial bee colony optimization. Information Sciences, 2018, 422: 462-479.