A rough set-based case-based reasoner for text categorization

Cited by: 34
Authors
Li, Y
Shiu, SCK [1]
Pal, SK
Liu, JNK
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
[2] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700035, W Bengal, India
Keywords
text categorization (TC); case-based reasoning (CBR); rough set; case coverage; case reachability;
DOI
10.1016/j.ijar.2005.06.019
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel rough set-based case-based reasoner for use in text categorization (TC). The reasoner has four main components: feature term extractor, document representor, case selector, and case retriever. It operates by first reducing the number of feature terms in the documents using the rough set technique. Then, the number of documents is reduced using a new document selection approach based on the case-based reasoning (CBR) concepts of coverage and reachability. As a result, both the number of feature terms and documents are reduced with only minimal loss of information. Finally, this smaller set of documents with fewer feature terms is used in TC. The proposed rough set-based case-based reasoner was tested on the Reuters21578 text datasets. The experimental results demonstrate its effectiveness and efficiency as it significantly reduced feature terms and documents, important for improving the efficiency of TC, while preserving and even improving classification accuracy. © 2005 Elsevier Inc. All rights reserved.
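The document-reduction step in the abstract relies on the CBR notions of coverage and reachability. The following is a minimal sketch of that idea, not the paper's algorithm: it assumes that one case "solves" another when both share a category label and their cosine similarity meets a threshold, and it uses a simple greedy pick of high-coverage cases. The function names, the `threshold` parameter, and the sparse-dict document representation are all hypothetical choices made for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def coverage_sets(cases, threshold):
    """Map each case index to the set of case indices it can 'solve'.

    Assumption: case i solves case j when the labels match and the
    cosine similarity is at least `threshold`. Every case covers itself.
    """
    cov = {i: {i} for i in range(len(cases))}
    for i, (vec_i, label_i) in enumerate(cases):
        for j, (vec_j, label_j) in enumerate(cases):
            if i != j and label_i == label_j and cosine(vec_i, vec_j) >= threshold:
                cov[i].add(j)
    return cov


def reachability_sets(cov):
    """Invert coverage: map each case to the set of cases that can solve it."""
    reach = {i: set() for i in cov}
    for i, covered in cov.items():
        for j in covered:
            reach[j].add(i)
    return reach


def select_cases(cases, threshold=0.5):
    """Greedily keep the cases that cover the most still-uncovered cases."""
    cov = coverage_sets(cases, threshold)
    uncovered = set(cov)
    selected = []
    while uncovered:
        # Ties are broken by the lowest index so the result is deterministic.
        best = max(sorted(uncovered), key=lambda i: len(cov[i] & uncovered))
        selected.append(best)
        uncovered -= cov[best]
    return selected
```

Each case is passed as a `(term_weights, label)` pair; `select_cases` returns the indices of the documents to retain, so every dropped document is still covered by at least one retained document of the same category under the assumed similarity rule.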
Pages: 229-255
Number of pages: 27
Related papers
50 in total
  • [1] A rough set-based case-based reasoner for text categorization
    Li, Y.
    Shiu, S.C.K.
    Pal, S.K.
    Liu, J.N.K.
    International Journal of Approximate Reasoning, 2006, 41 (02): 229 - 255
  • [2] A rough set-based hybrid method to text categorization
    Bao, Y
    Aoyama, S
    Du, XY
    Yamada, K
    Ishii, N
    SECOND INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS ENGINEERING, VOL I, PROCEEDINGS, 2002: 254 - 261
  • [3] Fuzzy Rough Set-Based Unstructured Text Categorization
    Bharadwaj, Aditya
    Ramanna, Sheela
    ADVANCES IN ARTIFICIAL INTELLIGENCE, CANADIAN AI 2017, 2017, 10233 : 335 - 340
  • [4] Rough Set-based SVM Classifier for Text Categorization
    Chen, Peng
    Liu, Shuang
    ICNC 2008: FOURTH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, VOL 2, PROCEEDINGS, 2008: 153 - +
  • [5] A Rough Set-based Reasoner for medical diagnosis
    Ghany, Kareem Kamal A.
    Ayeldeen, Heba
    Zawbaa, Hossam M.
    Shaker, Olfat
    2015 INTERNATIONAL CONFERENCE ON GREEN COMPUTING AND INTERNET OF THINGS (ICGCIOT), 2015: 429 - 434
  • [6] Rough set feature selection methods for case-based categorization of text documents
    Gupta, KM
    Moore, PG
    Aha, DW
    Pal, SK
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PROCEEDINGS, 2005, 3776 : 792 - 798
  • [7] A rough set-based CBR approach for feature and document reduction in text categorization
    Li, Y
    Shiu, SCK
    Pal, SK
    Liu, JNK
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004: 2438 - 2443
  • [8] A rough set-based approach to text classification
    Chouchoulas, A
    Shen, Q
    NEW DIRECTIONS IN ROUGH SETS, DATA MINING, AND GRANULAR-SOFT COMPUTING, 1999, 1711 : 118 - 127
  • [9] Partition for the rough set-based text classification
    Bao, YG
    Asai, D
    Du, XY
    Ishii, N
    ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2003, 2762 : 181 - 188
  • [10] An effective rough set-based method for text classification
    Bao, YG
    Asai, D
    Du, XY
    Yamada, K
    Ishii, N
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING, 2003, 2690 : 545 - 552