A novel feature selection approach based on clustering algorithm

Cited by: 8
Authors
Moslehi, Fateme [1]
Haeri, Abdorrahman [2]
Affiliations
[1] Iran Univ Sci & Technol, Informat Technol Engn, Tehran, Iran
[2] Iran Univ Sci & Technol, Sch Ind Engn, Tehran, Iran
Keywords
Data mining; clustering; K-means algorithm; feature selection; feature subset selection; gravitational search algorithm; particle swarm optimization; mutual information; classification; hybrid; reduction
DOI
10.1080/00949655.2020.1822358
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Clustering is one of the main methods of data mining. The K-means algorithm is among the most common clustering algorithms because of its efficiency and ease of use. In many data mining problems the dataset contains a large number of fields, so identifying the effective fields is important. Applying the proposed algorithm identifies the important variables of the dataset. In the proposed method, the dataset is clustered in several stages; at each stage the characteristics of the resulting clusters are examined, and the features that change the structure of the clusters are reported as the effective features of the dataset. The method was evaluated on 4 datasets and compared with similar work; the results indicate that it eliminates redundant and irrelevant features of the dataset and improves classification accuracy.
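The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: cluster the data, re-cluster after changing the feature set, and treat features whose removal changes the cluster structure as effective. The leave-one-feature-out loop, the use of the Rand index to measure structural change, and the names `kmeans`, `rand_index`, and `feature_impact` are all assumptions made for illustration, not the authors' procedure.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means with a deterministic farthest-point start."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from its nearest existing centre.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def rand_index(a, b):
    """Fraction of point pairs on which two clusterings agree (same/different cluster)."""
    n = len(a)
    agree = sum((a[i] == a[j]) == (b[i] == b[j])
                for i in range(n) for j in range(i + 1, n))
    return agree / (n * (n - 1) / 2)

def feature_impact(data, k):
    """Assumed scoring rule: 1 - Rand index between the full clustering and the
    clustering obtained with one feature left out. Needs at least two features."""
    base = kmeans(data, k)
    impacts = []
    for j in range(len(data[0])):
        reduced = [row[:j] + row[j + 1:] for row in data]
        impacts.append(1.0 - rand_index(base, kmeans(reduced, k)))
    return impacts

# Toy data (hypothetical): feature 0 separates two groups, feature 1 is small jitter.
data = [(0.1 * (i % 3), 0.2 * (i % 2)) for i in range(12)] + \
       [(5 + 0.1 * (i % 3), 0.2 * (i % 2)) for i in range(12)]
impacts = feature_impact(data, k=2)
print(impacts)  # a larger value means removing that feature changed the clusters more
```

On this toy data, removing feature 0 reshuffles the clusters while removing feature 1 leaves them unchanged, so feature 0 would be the one kept. A real evaluation would also need feature scaling and a rule for choosing the number of clusters, neither of which the abstract specifies.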
Pages: 581-604
Page count: 24