A new feature selection method to improve the document clustering using particle swarm optimization algorithm

Cited by: 363
Authors
Abualigah, Laith Mohammad [1]
Khader, Ahamad Tajudin [1]
Hanandeh, Essam Said [2]
Affiliations
[1] Univ Sains Malaysia, Sch Comp Sci, George Town 11800, Malaysia
[2] Zarqa Univ, Dept Comp Informat Syst, POB 13132, Zarqa, Jordan
Keywords
Unsupervised feature selection; Informative features; Particle swarm optimization algorithm; K-mean text clustering algorithm; Dimension reduction
DOI
10.1016/j.jocs.2017.07.018
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The large volume of text information on the Internet and in modern applications makes this information difficult to manage. Text clustering is an appropriate tool for handling an enormous number of text documents by grouping them into coherent groups. Large document size decreases the effectiveness of text clustering. Moreover, text documents contain sparse and uninformative features (i.e., noisy, irrelevant, and unnecessary features), which further reduce its effectiveness. Feature selection is a primary unsupervised learning method used to select the informative text features and create a new subset of a document's features, which increases the effectiveness of the underlying clustering algorithm. Recently, several complex optimization problems have been solved successfully using metaheuristic algorithms. This paper proposes a novel feature selection method, namely, feature selection method using the particle swarm optimization (PSO) algorithm (FSPSOTC), which solves the feature selection problem by creating a new subset of informative text features. This new subset of features can improve the performance of text clustering and reduce the computational time. Experiments were conducted on six standard text datasets with varying characteristics that are commonly used in the text clustering domain. The results show that the proposed method (FSPSOTC) enhances the effectiveness of text clustering by working with a new subset of informative features. The proposed method is compared with two other well-known text feature selection algorithms, namely, feature selection method using a genetic algorithm to improve the text clustering (FSGATC) and feature selection method using the harmony search algorithm to improve the text clustering (FSHSTC). © 2017 Elsevier B.V. All rights reserved.
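The idea in the abstract, a swarm searching for an informative subset of term features before clustering, can be illustrated with a small sketch. This is not the authors' FSPSOTC implementation: the abstract does not give the particle encoding, the fitness function, or the parameter values. The binary encoding with a sigmoid transfer function, the fitness based on mean absolute deviation, the names `make_fitness` and `binary_pso`, and every numeric setting below are assumptions chosen for illustration only.

```python
import math
import random


def mean_abs_deviation(column):
    """Mean absolute deviation of one term's weights across all documents."""
    mean = sum(column) / len(column)
    return sum(abs(x - mean) for x in column) / len(column)


def make_fitness(weights):
    """Build a fitness function from a document-term weight matrix.

    ASSUMPTION: a selected term is rewarded when its deviation across
    documents is above the average deviation; the abstract does not
    specify the fitness used in the paper.
    """
    n_terms = len(weights[0])
    mads = [mean_abs_deviation([doc[j] for doc in weights]) for j in range(n_terms)]
    baseline = sum(mads) / n_terms

    def fitness(mask):
        return sum(m - baseline for m, bit in zip(mads, mask) if bit)

    return fitness


def binary_pso(fitness, n_terms, n_particles=10, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask over the terms.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_terms)] for _ in range(n_particles)]
    vel = [[0.0] * n_terms for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_terms):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global best.
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, vel[i][j]))  # clamp to avoid saturation
                # Sigmoid transfer: velocity becomes the probability of bit = 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Toy document-term weights (3 documents x 4 terms), made up for illustration.
docs = [
    [0.9, 0.1, 0.5, 0.0],
    [0.0, 0.1, 0.5, 0.8],
    [0.8, 0.1, 0.5, 0.1],
]
fitness = make_fitness(docs)
mask, score = binary_pso(fitness, n_terms=4)
# Keep only the selected terms; this reduced matrix would feed the clustering step.
reduced = [[x for x, bit in zip(doc, mask) if bit] for doc in docs]
```

In a pipeline like the one the abstract describes, `reduced` would then be passed to a clustering algorithm such as k-means in place of the full document-term matrix, so the clustering works on fewer and more informative features.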
Pages: 456-466
Number of pages: 11