Combining homogeneous classifiers for centroid-based text classification

Cited by: 3
Authors
Lertnattee, V [1]
Theeramunkong, T [1]
Affiliations
[1] Thammasat Univ, Sirindhorn Int Inst Technol, Informat Technol Program, Pathum Thani 12121, Thailand
Source
ISCC 2002: SEVENTH INTERNATIONAL SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS, PROCEEDINGS | 2002
DOI
10.1109/ISCC.2002.1021799
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Centroid-based text classification is one of the most popular supervised approaches for classifying texts into a set of predefined classes. Because it is based on the vector-space model, its performance depends particularly on how important terms in documents are weighted and selected when constructing a prototype vector for each class. Earlier work showed that term weighting using statistical term distributions can improve classification accuracy. However, the best weighting scheme differs from one data set to another. To address this problem, we propose a method that combines homogeneous centroid-based classifiers. The effectiveness of this approach is explored using four data sets. Two main factors are taken into account: model selection and score combination. Experimental results show that our system improves classification accuracy by up to 7.5-8.5% compared with a k-NN classifier, 3.7-4.0% compared with a naive Bayes classifier, and 1.6-2.7% over the best single-model classification method (p<0.05).
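The general idea in the abstract (several centroid-based classifiers that differ only in term weighting, with their per-class scores combined) can be illustrated with a minimal sketch. The abstract does not say which weighting schemes, model-selection procedure, or combination rule the authors use, so everything below is an assumption made for illustration: plain TF and TF-IDF stand in for the weighting schemes, score averaging stands in for the combination step, and all function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tf_weights(docs):
    """Raw term-frequency vectors, one dict per tokenized document."""
    return [dict(Counter(doc)) for doc in docs]


def make_tfidf(train_docs):
    """Build a TF-IDF weighting function with IDF estimated on train_docs."""
    n = len(train_docs)
    df = Counter(term for doc in train_docs for term in set(doc))
    idf = {t: math.log(1 + n / c) for t, c in df.items()}

    def weight(docs):
        # Terms unseen in training get weight 0.
        return [{t: f * idf.get(t, 0.0) for t, f in Counter(doc).items()}
                for doc in docs]
    return weight


def normalize(vec):
    """Scale a sparse vector to unit length (empty dict if all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def train_centroids(vectors, labels):
    """One prototype vector per class: normalized sum of unit document vectors."""
    sums = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for t, v in normalize(vec).items():
            sums[label][t] += v
    return {label: normalize(s) for label, s in sums.items()}


def cosine_scores(vec, centroids):
    """Cosine similarity between one document vector and every class centroid."""
    q = normalize(vec)
    return {label: sum(w * c.get(t, 0.0) for t, w in q.items())
            for label, c in centroids.items()}


def combined_predict(doc, models):
    """Average per-class scores over all models (an assumed combination rule)."""
    totals = defaultdict(float)
    for weight_fn, centroids in models:
        scores = cosine_scores(weight_fn([doc])[0], centroids)
        for label, s in scores.items():
            totals[label] += s / len(models)
    return max(totals, key=totals.get)


# Toy data, purely illustrative (not from the paper).
train_docs = [["goal", "match", "team"], ["team", "score", "goal"],
              ["stock", "market", "price"], ["price", "market", "trade"]]
train_labels = ["sports", "sports", "finance", "finance"]

# Two homogeneous classifiers: same algorithm, different term weighting.
models = [(w, train_centroids(w(train_docs), train_labels))
          for w in (tf_weights, make_tfidf(train_docs))]

print(combined_predict(["goal", "team"], models))
```

Averaging is only one possible way to combine scores; a weighted sum or a maximum over models would slot into `combined_predict` the same way.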
Pages: 1034-1039
Number of pages: 6