A rough set-based case-based reasoner for text categorization

Cited by: 34
Authors
Li, Y
Shiu, SCK [1]
Pal, SK
Liu, JNK
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
[2] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700035, W Bengal, India
Keywords
text categorization (TC); case-based reasoning (CBR); rough set; case coverage; case reachability;
DOI
10.1016/j.ijar.2005.06.019
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a novel rough set-based case-based reasoner for use in text categorization (TC). The reasoner has four main components: a feature term extractor, a document representor, a case selector, and a case retriever. It operates by first reducing the number of feature terms in the documents using the rough set technique. Then, the number of documents is reduced using a new document selection approach based on the case-based reasoning (CBR) concepts of coverage and reachability. As a result, both the number of feature terms and the number of documents are reduced with only minimal loss of information. Finally, this smaller set of documents with fewer feature terms is used in TC. The proposed rough set-based case-based reasoner was tested on the Reuters-21578 text datasets. The experimental results demonstrate its effectiveness and efficiency: it significantly reduced the number of feature terms and documents, which is important for improving the efficiency of TC, while preserving and even improving classification accuracy. (C) 2005 Elsevier Inc. All rights reserved.
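The document selection step described in the abstract relies on the CBR notions of coverage and reachability. The sketch below is only an illustration of those two notions, not the paper's algorithm: it assumes documents are represented as sets of feature terms, uses Jaccard similarity with a fixed threshold as a stand-in for "case i can solve case j", and picks cases greedily by how many still-uncovered cases they cover. All names (`coverage_sets`, `reachability_sets`, `select_cases`, `DEMO_CASES`) and the toy data are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A case is (feature terms, category label). This representation is an
# assumption made for illustration; the paper's own representation may differ.
Case = Tuple[Set[str], str]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two term sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coverage_sets(cases: List[Case], threshold: float) -> Dict[int, Set[int]]:
    """coverage[i] = indices of cases that case i is assumed to 'solve':
    same label and similarity >= threshold. Every case covers itself."""
    cov: Dict[int, Set[int]] = {}
    for i, (terms_i, label_i) in enumerate(cases):
        cov[i] = {
            j
            for j, (terms_j, label_j) in enumerate(cases)
            if i == j
            or (label_i == label_j and jaccard(terms_i, terms_j) >= threshold)
        }
    return cov


def reachability_sets(coverage: Dict[int, Set[int]]) -> Dict[int, Set[int]]:
    """reachability[j] = indices of cases that cover j (inverse of coverage).
    With a symmetric similarity, as here, it coincides with coverage."""
    reach: Dict[int, Set[int]] = {i: set() for i in coverage}
    for i, covered in coverage.items():
        for j in covered:
            reach[j].add(i)
    return reach


def select_cases(cases: List[Case], threshold: float) -> List[int]:
    """Greedy selection: repeatedly keep the uncovered case that covers the
    most still-uncovered cases (ties go to the lowest index)."""
    cov = coverage_sets(cases, threshold)
    uncovered = set(range(len(cases)))
    selected: List[int] = []
    while uncovered:
        best = max(sorted(uncovered), key=lambda i: len(cov[i] & uncovered))
        selected.append(best)
        uncovered -= cov[best]
    return selected


# Toy data, invented for this example only.
DEMO_CASES: List[Case] = [
    ({"oil", "price", "barrel"}, "crude"),
    ({"oil", "price", "opec"}, "crude"),
    ({"oil", "barrel", "opec"}, "crude"),
    ({"wheat", "crop"}, "grain"),
    ({"wheat", "crop", "harvest"}, "grain"),
]

if __name__ == "__main__":
    print(select_cases(DEMO_CASES, threshold=0.5))
```

On the toy data, one document per category is enough to cover the rest, so five cases shrink to two. A real system would need a similarity measure and threshold tuned to the corpus, and would apply this step after the feature terms have already been reduced.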
Pages: 229-255
Number of pages: 27
Related papers
50 in total
  • [11] The Research of Tax Text Categorization based on Rough Set
    Liu, Bin
    Xu, Guang
    Xu, Qian
    Zhang, Nan
    2012 INTERNATIONAL CONFERENCE ON MEDICAL PHYSICS AND BIOMEDICAL ENGINEERING (ICMPBE2012), 2012, 33 : 1683 - 1688
  • [12] A rough set-based corporate memory for the case of ecotourism
    Huang, Chun-Che
    Liang, Wen-Yau
    Tseng, Tzu-Liang
    Wong, Ruo-Yin
    TOURISM MANAGEMENT, 2015, 47 : 22 - 33
  • [13] A semantic case-based reasoning framework for text categorization
    Ceausu, Valentina
    Despres, Sylvie
    SEMANTIC WEB, PROCEEDINGS, 2007, 4825 : 736 - +
  • [14] An Algorithm for Case-Based Reasoning Based on Similarity Rough Set
    Ji, Sai
    Yuan, Shen-fang
    Wang, Shui-ping
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 5, PROCEEDINGS, 2008, : 226 - +
  • [15] Method of Chinese Text Categorization Based On Variable Precision Rough Set
    Wang, Ming-Yan
    Liu, Ting
    IITAW: 2009 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATIONS WORKSHOPS, 2009, : 26 - 29
  • [16] A rough set-based fuzzy clustering
    Zhao, YQ
    Zhou, XZ
    Tang, GZ
    INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS, 2005, 3689 : 401 - 409
  • [17] PERFORMANCE ANALYSIS OF ROUGH SET-BASED IN THE CASE OF MISSING VALUES
    Nowicki, Robert K.
    Seliga, Robert
    Zelasko, Dariusz
    Hayashi, Yoichi
    JOURNAL OF ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING RESEARCH, 2021, 11 (04) : 307 - 318
  • [18] Rough set approach to case-based reasoning application
    Huang, CC
    Tseng, TL
    EXPERT SYSTEMS WITH APPLICATIONS, 2004, 26 (03) : 369 - 385
  • [19] Effective Text Classification Through Supervised Rough Set-Based Term Weighting
    Cekik, Rasim
    SYMMETRY-BASEL, 2025, 17 (01):
  • [20] A large case-based reasoner for legal cases
    Weber-Lee, R
    Barcia, RM
    da Costa, MC
    Rodrigues, IW
    Hoeschl, HC
    Bueno, TCD
    Martins, A
    Pacheco, RC
    CASE-BASED REASONING RESEARCH AND DEVELOPMENT, 1997, 1266 : 190 - 199