Incorporating self-organizing map with text mining techniques for text hierarchy generation

Cited by: 9
Authors
Yang, Hsin-Chang [1]
Lee, Chung-Hong [2]
Hsiao, Han-Wei [1]
Affiliations
[1] Natl Univ Kaohsiung, Dept Informat Management, Kaohsiung, Taiwan
[2] Natl Kaohsiung Univ Appl Sci, Dept Elect Engn, Kaohsiung 807, Taiwan
Keywords
Text mining; Self-organizing map; Topic identification; Hierarchy generation; GROWING CELL STRUCTURES; NETWORK; CLASSIFICATION; MODEL
DOI
10.1016/j.asoc.2015.05.005
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Self-organizing maps (SOMs) have been applied to numerous data clustering and visualization tasks and have received much attention for their success. One major shortcoming of the classical SOM learning algorithm is that the map topology must be defined in advance. Furthermore, hierarchical relationships among the data are difficult to discover. Several approaches have been devised to overcome these deficiencies. In this work, we propose a novel SOM learning algorithm that incorporates several text mining techniques to expand the map both laterally and hierarchically. When trained on a set of text documents, the proposed algorithm first clusters them using the classical SOM algorithm. We then identify the topics of each cluster. These topics are then used to evaluate the criteria for expanding the map. The major characteristic of the proposed approach is that it combines the learning process with the text mining process, which makes it suitable for the automatic organization of text documents. We applied the algorithm to the Reuters-21578 dataset in text clustering and categorization tasks. Our method outperforms two comparison models in hierarchy quality according to user evaluation. It also achieves better F1-scores than two other models in the text categorization task. © 2015 Elsevier B.V. All rights reserved.
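The first two steps named in the abstract (clustering documents with a classical SOM, then identifying topics per cluster) can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy term-count matrix, the function names train_som, assign_clusters and cluster_topics, the linear decay schedules, and the rule "topics = highest-weight terms of the cluster mean" are all assumptions made for illustration. The abstract does not specify the expansion criteria, so the lateral and hierarchical growth step is not sketched.

```python
import numpy as np

def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Classical SOM on a fixed rows x cols grid (the topology is predefined)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows * cols, data.shape[1]))
    # Grid coordinates of each neuron, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # assumed linear decay
        sigma = max(sigma0 * (1.0 - frac), 0.1)  # assumed linear decay with a floor
        for x in data[rng.permutation(len(data))]:
            # Best-matching unit: the neuron whose weight vector is closest to x.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighborhood centered on the BMU in grid space.
            dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
    return weights

def assign_clusters(data, weights):
    """Label each document with the index of its best-matching neuron."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def cluster_topics(data, labels, vocab, top_k=2):
    """Assumed topic rule: the top_k terms with the largest mean weight per cluster."""
    topics = {}
    for unit in np.unique(labels):
        mean_vec = data[labels == unit].mean(axis=0)
        top = np.argsort(mean_vec)[::-1][:top_k]
        topics[int(unit)] = [vocab[i] for i in top]
    return topics

# Hypothetical toy corpus: rows are documents, columns are term counts.
vocab = ["oil", "crude", "wheat", "grain"]
docs = np.array([[3, 2, 0, 0],
                 [2, 3, 0, 0],
                 [0, 0, 3, 2],
                 [0, 0, 2, 3]], dtype=float)

weights = train_som(docs, rows=1, cols=2)
labels = assign_clusters(docs, weights)
topics = cluster_topics(docs, labels, vocab)
print(labels, topics)
```

In a growing variant such as the one the abstract describes, the topics computed here would feed the decision to add neurons or spawn a child map; that decision rule would have to come from the full paper.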
Pages: 251-259
Page count: 9