Chinese text categorization based on CCIPCA and SMO

Cited by: 1
Authors
Li, Xin-Fu [1]
He, Hai-Bin [1]
Zhao, Lei-Lei [1]
Affiliations
[1] Hebei Univ, Coll Math & Comp Sci, Baoding 071002, Peoples R China
Source
PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2008
Keywords
text categorization; dimension reduction; candid incremental principal component analysis (CCIPCA); sequential minimization optimization algorithm (SMO)
DOI
10.1109/ICMLC.2008.4620831
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The vector space model is commonly used to represent text for text categorization. Reducing the dimensionality of the feature space is a key problem in practical text classification. Classical decomposition algorithms cannot handle high-dimensional, large-scale text categorization problems. This paper presents an approach to improving text categorization performance that combines candid covariance-free incremental principal component analysis (CCIPCA) for dimension reduction with the sequential minimal optimization (SMO) algorithm. The experimental results show that the proposed method is practicable and effective for Chinese text categorization.
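The dimension-reduction step named in the abstract can be illustrated with a minimal sketch of the CCIPCA update rule. This is not the authors' implementation. It assumes zero-mean, nonzero input vectors, and the names `ccipca`, `dot`, and `amnesic` are illustrative only. Each sample updates the running estimates of the leading principal directions without ever forming a covariance matrix, which is what makes the method usable on high-dimensional term vectors.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def ccipca(samples, k, amnesic=0.0):
    """Estimate the first k principal directions, one sample at a time.

    Assumes the samples are zero-mean and nonzero (hypothetical
    preprocessing, not described in the abstract). Returns unnormalized
    vectors; the norm of each vector estimates the matching eigenvalue.
    """
    components = []
    for n, sample in enumerate(samples, start=1):
        u = list(sample)  # residual of the current sample
        for i in range(min(k, n)):
            if i == n - 1:
                # First time this component is seen: start from the residual.
                components.append(u[:])
            else:
                v = components[i]
                norm_v = math.sqrt(dot(v, v))
                proj = dot(u, v) / norm_v
                w_old = (n - 1 - amnesic) / n
                w_new = (1 + amnesic) / n
                # Weighted average of the old estimate and the new evidence.
                components[i] = [
                    w_old * vj + w_new * proj * uj for vj, uj in zip(v, u)
                ]
            # Deflate: remove the part of u explained by component i.
            v = components[i]
            coef = dot(u, v) / dot(v, v)
            u = [uj - coef * vj for uj, vj in zip(u, v)]
    return components
```

In a pipeline of the kind the abstract describes, each document vector would be projected onto the normalized components, and the reduced vectors would then be passed to a support vector machine trained with SMO.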
Pages: 2514-2518
Page count: 5