Combining word based and word co-occurrence based sequence analysis for text categorization

Citations: 0
Authors
Luo, X [1]
Zincir-Heywood, AN [1]
Affiliations
[1] Dalhousie Univ, Fac Comp Sci, Halifax, NS, Canada
Source
PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2004
Keywords
text categorization; sequence analysis; self organization map;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a text categorization system that combines a hierarchical SOM (self-organizing map) encoding architecture with a specially designed kNN classifier. The encoding architecture encodes each document as sequences of neurons, so that the sequences of words and word co-occurrences, as well as their frequencies, are preserved. The system achieves good performance (micro-averaged F1-measure of 0.98) on the experimental data set. This sequence analysis system for text categorization could automatically solve the high-dimensionality problem for large data sets, and it could be applied to other data categorization tasks in which sequence information is significant.
Pages: 1580-1585
Number of pages: 6