Unsupervised Feature Selection Technique Based on Harmony Search Algorithm for Improving the Text Clustering

Times cited: 0
Authors
Abualigah, Laith Mohammad [1 ]
Khader, Ahamad Tajudin [1 ]
Al-Betar, Mohammed Azmi [2 ]
Affiliations
[1] USM, Sch Comp Sci, George Town 11800, Malaysia
[2] Al Huson Univ Coll, Dept Informat Technol, Irbid, Jordan
Source
2016 7TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY (CSIT) | 2016
Keywords
Unsupervised Feature Selection; Harmony Search Algorithm; K-mean Text Clustering; Informative features; Sparse features; DIMENSION REDUCTION;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The growing volume of text on Internet web pages complicates clustering analysis. Text clustering is a useful analysis technique for partitioning a massive amount of information into clusters. A major problem affecting text clustering is the presence of uninformative and sparse features in text documents. Feature selection (FS) is an important unsupervised technique used to eliminate uninformative features and thereby improve text clustering. Recently, meta-heuristic algorithms have been successfully applied to several optimization problems. In this paper, we propose using the harmony search (HS) algorithm to solve the feature selection problem (FSHSTC). The proposed method enhances the text clustering (TC) technique by obtaining a new subset of informative features. Experiments were conducted on four benchmark text datasets. The results show that the proposed FSHSTC improves the performance of the k-means clustering algorithm, as measured by F-measure and accuracy.
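The abstract does not give the details of FSHSTC, so the following is only a minimal sketch of how a binary harmony search could pick a feature subset, not the authors' implementation. The function name `harmony_search_select`, the parameter defaults (`hms`, `hmcr`, `par`, `iters`), and the toy fitness function are all assumptions made for illustration. In a real text-clustering setting, the fitness would score a candidate subset of terms, for example from their document weights.

```python
import random

def harmony_search_select(fitness, n_features, hms=10, hmcr=0.9, par=0.3,
                          iters=200, seed=0):
    """Generic binary harmony search (illustrative, hypothetical parameters).

    fitness: callable taking a 0/1 list (1 = keep the feature), higher is better.
    Returns (best_mask, best_score).
    """
    rng = random.Random(seed)
    # Harmony memory: hms random candidate feature masks.
    memory = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(hms)]
    scores = [fitness(m) for m in memory]

    for _ in range(iters):
        new = []
        for j in range(n_features):
            if rng.random() < hmcr:
                # Memory consideration: copy bit j from a random stored harmony.
                bit = memory[rng.randrange(hms)][j]
                if rng.random() < par:
                    bit = 1 - bit  # pitch adjustment: flip the bit
            else:
                bit = rng.randint(0, 1)  # random consideration
            new.append(bit)
        new_score = fitness(new)
        # Replace the worst stored harmony if the new one is better.
        worst = min(range(hms), key=scores.__getitem__)
        if new_score > scores[worst]:
            memory[worst], scores[worst] = new, new_score

    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]

# Toy fitness (assumption): each feature has a fixed usefulness weight;
# positive weights stand in for informative terms, negative for noise.
WEIGHTS = [0.9, -0.2, 0.5, -0.7, 0.3, -0.1]

def toy_fitness(mask):
    return sum(w * b for w, b in zip(WEIGHTS, mask))

mask, score = harmony_search_select(toy_fitness, n_features=len(WEIGHTS))
print(mask, round(score, 3))
```

The selected mask would then be used to keep only the chosen columns of the document-term matrix before running k-means on the reduced representation.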
Pages: 6