A Comprehensive Review of Feature Selection and Feature Selection Stability in Machine Learning

Cited by: 22
Authors
Buyukkececi, Mustafa [1 ]
Okur, Mehmet Cudi [2 ]
Affiliations
[1] Univerlist, Izmir, Turkiye
[2] Yasar Univ, Fac Engn, Dept Software Engn, Izmir, Turkiye
Source
GAZI UNIVERSITY JOURNAL OF SCIENCE | 2023, Vol. 36, Issue 04
Keywords
Feature selection; Dimensionality reduction; Types of feature selection; Feature selection stability; Stability measures; MICROARRAY; ALGORITHMS; BIAS;
DOI
10.35378/gujs.993763
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject Classification Codes
07; 0710; 09;
Abstract
Feature selection is a dimensionality reduction technique used to select the features that are relevant to a machine learning task. Reducing dataset size by eliminating redundant and irrelevant features plays a pivotal role in improving the performance of machine learning algorithms, speeding up the learning process, and building simpler models. The evident need for feature selection has attracted considerable interest among researchers, and feature selection has found application in a wide range of domains, including text mining, pattern recognition, cybersecurity, bioinformatics, and big data. As a result, a substantial body of literature has been published on feature selection over the years, and a wide variety of feature selection methods have been proposed. The quality of a feature selection algorithm is measured not only by evaluating the quality of the models built with the features it selects, or by the clustering tendencies of those features, but also by its stability. Therefore, this study focuses on feature selection and feature selection stability. In the pages that follow, general concepts and methods of feature selection, feature selection stability, stability measures, and the causes of and remedies for instability are discussed.
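To make the notion of stability from the abstract concrete, the sketch below computes one commonly used stability measure: the average pairwise Jaccard similarity between the feature subsets a selector returns on different resamples of the data. The function names and the example subsets are illustrative, not taken from the paper; the survey itself covers several other measures.

```python
# Minimal sketch of a feature selection stability measure:
# average pairwise Jaccard similarity between selected-feature subsets.
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two selected-feature sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def stability(subsets):
    """Average pairwise Jaccard similarity over the feature subsets
    selected on different resamples; 1.0 means the selector picked
    exactly the same features every time."""
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Illustrative subsets, as if selected on three bootstrap samples.
subsets = [{"f1", "f2", "f3"}, {"f1", "f2", "f4"}, {"f1", "f3", "f4"}]
print(stability(subsets))  # each pair shares 2 of 4 features -> 0.5
```

A value close to 1 indicates a selector that is robust to perturbations of the training data, which is the property the stability measures surveyed in the paper quantify.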
Pages: 1506-1520
Number of pages: 15