Relevant feature selection and ensemble classifier design using bi-objective genetic algorithm

Cited: 12
Authors
Das, Asit Kumar [1 ]
Pati, Soumen Kumar [2 ]
Ghosh, Arka [1 ]
Affiliations
[1] Indian Inst Engn Sci & Technol, Comp Sci & Technol, Howrah 711103, W Bengal, India
[2] Maulana Abul Kalam Azad Univ Technol, Bioinformat, Nadia 741249, W Bengal, India
Keywords
Feature selection; Cellular automata; Lower bound approximation; Kullback-Leibler divergence; Bi-objective genetic algorithm; Ensemble classifier; Rough set; Systems
DOI
10.1007/s10115-019-01341-6
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the era of the digital boom, a single classifier cannot perform well across diverse datasets. An ensemble classifier aims to bridge this performance gap by combining multiple classifiers with diverse characteristics to achieve better generalization. However, classifier selection depends strongly on the dataset, and classifier efficiency degrades drastically in the presence of irrelevant features. Feature selection aids classifier performance by removing these irrelevant features. First, we propose a bi-objective genetic algorithm-based feature selection method (FSBOGA), in which nonlinear, uniform, hybrid cellular automata are used to generate the initial population. The objective functions are defined using the lower bound approximation of rough set theory and the Kullback-Leibler divergence of information theory, so that unambiguous and informative features are selected. The replacement strategy for creating the next-generation population is based on Pareto optimality with respect to both objective functions. Next, a novel bi-objective genetic algorithm-based ensemble classification method (CCBOGA) is devised to combine the individual classifiers trained on the reduced datasets. The constructed ensemble classifier is observed to perform better than the individual classifiers. The performance of the proposed FSBOGA and CCBOGA is evaluated on several popular datasets and compared with state-of-the-art algorithms to demonstrate their effectiveness.
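The abstract describes the FSBOGA loop only at a high level. The Python sketch below is an illustrative rendering, not the authors' implementation: it evolves boolean feature masks with a bi-objective genetic algorithm whose two maximized objectives stand in for the paper's criteria, a rough-set-style consistency score (a surrogate for the lower bound approximation) and a Kullback-Leibler relevance score, and it replaces dominated population members via Pareto dominance. Random bit-string initialization is used here in place of the cellular-automata seeding, and all names (fsboga_sketch, consistency, kl_relevance) are hypothetical.

```python
# Minimal sketch (not the authors' FSBOGA implementation) of bi-objective GA
# feature selection with Pareto-dominance replacement. Objectives are stand-ins:
# a rough-set-style consistency score (lower-approximation surrogate) and a
# KL-divergence relevance score. Random bit-string initialization replaces the
# paper's cellular-automata seeding; all names here are illustrative.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)

def consistency(X, y, mask):
    # Fraction of objects whose selected-feature signature maps to exactly one
    # class: a crude surrogate for the size of the rough-set lower approximation.
    if mask.sum() == 0:
        return 0.0
    labels, counts = defaultdict(set), defaultdict(int)
    for row, c in zip(X[:, mask], y):
        key = tuple(row)
        labels[key].add(c)
        counts[key] += 1
    return sum(counts[k] for k in labels if len(labels[k]) == 1) / len(y)

def kl_relevance(X, y, mask):
    # Mean KL divergence between class-conditional and marginal value
    # distributions of each selected (discrete-valued) feature.
    if mask.sum() == 0:
        return 0.0
    total = 0.0
    for j in np.flatnonzero(mask):
        vals, marg = np.unique(X[:, j], return_counts=True)
        marg = marg / marg.sum()
        for c in np.unique(y):
            cond = np.array([(X[y == c, j] == v).sum() for v in vals]) + 1e-9
            cond = cond / cond.sum()
            total += (y == c).mean() * np.sum(cond * np.log(cond / marg))
    return total / mask.sum()

def dominates(a, b):
    # Pareto dominance when both objectives are maximized.
    return all(ai >= bi for ai, bi in zip(a, b)) and a != b

def fsboga_sketch(X, y, pop_size=20, generations=200, p_mut=0.05):
    n_feat = X.shape[1]
    pop = rng.random((pop_size, n_feat)) < 0.5            # random init (simplification)
    fit = [(consistency(X, y, m), kl_relevance(X, y, m)) for m in pop]
    for _ in range(generations):
        i, j = rng.choice(pop_size, size=2, replace=False)
        cut = rng.integers(1, n_feat)                      # one-point crossover
        child = np.concatenate([pop[i][:cut], pop[j][cut:]])
        child ^= rng.random(n_feat) < p_mut                # bit-flip mutation
        cf = (consistency(X, y, child), kl_relevance(X, y, child))
        for k in range(pop_size):                          # Pareto-based replacement
            if dominates(cf, fit[k]):
                pop[k], fit[k] = child, cf
                break
    best = max(range(pop_size), key=lambda k: fit[k])      # report one good solution
    return pop[best], fit[best]

# Toy usage on a small discrete dataset: the class depends on features 0 and 2.
X = rng.integers(0, 3, size=(60, 8))
y = (X[:, 0] + X[:, 2] > 2).astype(int)
mask, scores = fsboga_sketch(X, y)
print("selected features:", np.flatnonzero(mask), "objectives:", scores)
```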
Pages: 423-455
Page count: 33