A new feature selection method to improve the document clustering using particle swarm optimization algorithm

Cited by: 363
Authors
Abualigah, Laith Mohammad [1]
Khader, Ahamad Tajudin [1]
Hanandeh, Essam Said [2]
Affiliations
[1] Univ Sains Malaysia, Sch Comp Sci, George Town 11800, Malaysia
[2] Zarqa Univ, Dept Comp Informat Syst, POB 13132, Zarqa, Jordan
Keywords
Unsupervised feature selection; Informative features; Particle swarm optimization algorithm; K-mean text clustering algorithm; Dimension reduction
DOI
10.1016/j.jocs.2017.07.018
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The large volume of text information on the Internet and in modern applications makes this information difficult to manage. Text clustering is an appropriate tool for handling an enormous number of text documents by grouping them into coherent groups. Document size decreases the effectiveness of text clustering. Moreover, text documents contain sparse and uninformative features (i.e., noisy, irrelevant, and unnecessary features), which also reduce its effectiveness. Feature selection is a primary unsupervised learning method used to select informative text features and create a new subset of a document's features, which increases the effectiveness of the underlying clustering algorithm. Recently, several complex optimization problems have been solved successfully using metaheuristic algorithms. This paper proposes a novel feature selection method, namely feature selection using the particle swarm optimization (PSO) algorithm (FSPSOTC), to solve the feature selection problem by creating a new subset of informative text features. This new subset can improve the performance of text clustering and reduce computational time. Experiments were conducted on six standard text datasets with different characteristics that are commonly used in the text clustering domain. The results show that the proposed method (FSPSOTC) enhanced the effectiveness of text clustering by working with a new subset of informative features. The proposed method is compared with other well-known text feature selection algorithms, namely feature selection using a genetic algorithm to improve text clustering (FSGATC) and feature selection using the harmony search algorithm to improve text clustering (FSHSTC). (C) 2017 Elsevier B.V. All rights reserved.
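The abstract does not give the update rules or the objective function used by FSPSOTC, so the following is only a minimal sketch of the general idea, assuming a standard sigmoid-based binary PSO in which each particle is a 0/1 mask over features. The variance-based score, the size penalty, and all function and parameter names (`feature_scores`, `fitness`, `binary_pso`, `penalty`) are hypothetical choices for illustration and are not taken from the paper.

```python
import math
import random

def feature_scores(docs):
    """Per-feature variance across documents (an assumed stand-in relevance score)."""
    n = len(docs)
    scores = []
    for j in range(len(docs[0])):
        col = [d[j] for d in docs]
        mean = sum(col) / n
        scores.append(sum((x - mean) ** 2 for x in col) / n)
    return scores

def fitness(mask, scores, penalty=0.05):
    """Hypothetical objective: total score of kept features minus a size penalty."""
    if not any(mask):
        return float("-inf")  # an empty feature subset is never acceptable
    return sum(s for s, keep in zip(scores, mask) if keep) - penalty * sum(mask)

def binary_pso(scores, n_particles=10, n_iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask, scores)."""
    rng = random.Random(seed)
    dim = len(scores)
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, scores) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, vel[i][j]))  # clamp velocity
                # The sigmoid turns velocity into the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i], scores)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

# Toy term-frequency matrix: 4 documents x 4 terms (made-up data).
docs = [
    [3, 0, 1, 1],
    [0, 2, 1, 1],
    [4, 0, 1, 0],
    [0, 3, 1, 1],
]
scores = feature_scores(docs)
mask, best = binary_pso(scores)
print(mask, round(best, 3))
```

In a full pipeline, the columns where `mask` is 1 would form the reduced document representation passed to a clustering algorithm such as k-means. That step is omitted here to keep the sketch dependency-free.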
Pages: 456-466
Number of pages: 11
Related papers
50 in total
  • [1] Simultaneous Feature Selection and Clustering Using Particle Swarm Optimization
    Swetha, K. P.
    Devi, V. Susheela
    NEURAL INFORMATION PROCESSING, ICONIP 2012, PT I, 2012, 7663 : 509 - 515
  • [2] Text document clustering using Spectral Clustering algorithm with Particle Swarm Optimization
    Janani, R.
    Vijayarani, S.
    EXPERT SYSTEMS WITH APPLICATIONS, 2019, 134 : 192 - 200
  • [3] Document clustering using Particle Swarm Optimization
    Cui, XH
    Potok, TE
    Palathingal, P
    2005 IEEE SWARM INTELLIGENCE SYMPOSIUM, 2005, : 185 - 191
  • [4] Feature Subset Selection for Clustering using Binary Particle Swarm Optimization
    Dastider, Surjodoy Ghosh
    Kashyap, Himanshu
    Mandal, Shashwata
    Ghosh, Abhinandan
    Goswami, Saptarsi
    2015 14TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY (ICIT 2015), 2015, : 159 - 164
  • [5] Feature selection with clustering probabilistic particle swarm optimization
    Gao, Jinrui
    Wang, Ziqian
    Lei, Zhenyu
    Wang, Rong-Long
    Wu, Zhengwei
    Gao, Shangce
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (09) : 3599 - 3617
  • [6] An Efficient Feature Selection Method Using Hybrid Particle Swarm Optimization with Genetic Algorithm
    Narayanan, Arya
    Praveen, A. N.
    INTERNATIONAL CONFERENCE ON INTELLIGENT DATA COMMUNICATION TECHNOLOGIES AND INTERNET OF THINGS, ICICI 2018, 2019, 26 : 1148 - 1155
  • [7] The feature selection method for SVM with discrete particle swarm optimization algorithm
    Peng Xiyuan
    Wu Hongxing
    Peng Yu
    ISTM/2007: 7TH INTERNATIONAL SYMPOSIUM ON TEST AND MEASUREMENT, VOLS 1-7, CONFERENCE PROCEEDINGS, 2007, : 2523 - 2526
  • [8] A New Particle Swarm Optimization Algorithm for Clustering
    Xu, Xiangping
    Li, Jun
    2018 IEEE 14TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2018, : 768 - 773
  • [9] Using selection to improve particle swarm optimization
    Angeline, PJ
    1998 IEEE INTERNATIONAL CONFERENCE ON EVOLUTIONARY COMPUTATION - PROCEEDINGS, 1998, : 84 - 89
  • [10] NEW APPROACHES TO CLUSTERING DATA Using the Particle Swarm Optimization Algorithm
    Abdalla Esmin, Ahmed Ali
    Pereira, Dilson Lucas
    ICEIS 2008: PROCEEDINGS OF THE TENTH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS, VOL AIDSS: ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS, 2008, : 593 - 597