Term-length normalization for centroid-based text categorization

Citations: 0
Authors
Lertnattee, V [1]
Theeramunkong, T [1]
Affiliation
[1] Thammasat Univ, Sirindhorn Int Inst Technol, Informat Technol Program, Pathum Thani 12121, Thailand
Source
KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS | 2003 / Vol. 2773
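Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract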
Centroid-based categorization is one of the most popular algorithms in text classification. Normalization is an important factor in improving the performance of a centroid-based classifier when documents in a text collection differ considerably in size. In the past, normalization involved only document-length or class-length normalization. In this paper, we propose a new type of normalization called term-length normalization, which considers term distribution in a class. The performance of this normalization is investigated in three environments of a standard centroid-based classifier (TFIDF): (1) without class-length normalization, (2) with cosine class-length normalization, and (3) with summing weight normalization. The results suggest that our term-length normalization is useful for improving classification accuracy in all cases.
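The three class-length settings named in the abstract can be illustrated with a minimal sketch of a generic TF-IDF centroid classifier. This is not the authors' implementation, and it does not include the proposed term-length normalization, which the abstract does not define. All function names are hypothetical, and reading "summing weight normalization" as division by the sum of the centroid's weights is an assumption.

```python
import math
from collections import Counter, defaultdict

def tfidf_vectors(docs):
    """docs: list of token lists. Returns (sparse TF-IDF dicts, idf table)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf

def class_centroids(vectors, labels, norm="cosine"):
    """Sum the document vectors of each class, then apply class-length normalization.

    norm="none"   : no class-length normalization
    norm="cosine" : divide by the Euclidean length of the class vector
    norm="sum"    : divide by the sum of the class vector's weights (assumed reading)
    """
    sums = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for t, w in vec.items():
            sums[label][t] += w
    centroids = {}
    for label, vec in sums.items():
        if norm == "cosine":
            scale = math.sqrt(sum(w * w for w in vec.values()))
        elif norm == "sum":
            scale = sum(vec.values())
        else:
            scale = 1.0
        # Guard against an all-zero class vector.
        centroids[label] = {t: w / scale for t, w in vec.items()} if scale else dict(vec)
    return centroids

def classify(tokens, centroids, idf):
    """Assign the class whose centroid has the largest dot product with the query."""
    query = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokens).items()}
    def score(centroid):
        return sum(w * centroid.get(t, 0.0) for t, w in query.items())
    return max(centroids, key=lambda label: score(centroids[label]))
```

Switching the `norm` argument between "none", "cosine", and "sum" reproduces the three baseline environments in spirit; a term-level normalization would be an additional step applied to the term weights before the centroids are compared.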
Pages: 850-856
Number of pages: 7