Improvement in Automatic Classification of Persian Documents by Means of Support Vector Machine and Representative Vector

Cited by: 0
Authors
Ashkan, Jafari [1]
Hamed, Ezadi [2]
Mihan, Hossennejad [1]
Taher, Noohi [3]
Affiliations
[1] Islamic Azad Univ, Jolfa Branch, Jolfa, Iran
[2] Islamic Azad Univ, Eghlid Branch, Eghlid, Iran
[3] Islamic Azad Univ, Najafabad Branch, Najafabad, Iran
Source
INNOVATIVE COMPUTING TECHNOLOGY | 2011 / Vol. 241
Keywords
Documents Classification; Representative Vector; Stemming; Support Vector Machine; CATEGORIZATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A Representative Vector is a vector that contains related words together with the degree of their relationship. This paper examines the effect of using such vectors on the automatic classification of Persian documents. In this method, the documents are first preprocessed, and extra words as well as word stems are identified. Next, features are extracted for each category using one of the known methods. Then, the Representative Vector, which is built from the extracted features, yields more specific words that are better representatives of each category. The experimental findings show that Precision and Recall can be increased significantly by omitting extra words and adding a few words in the Representative Vectors, together with the use of a well-known classification model such as the Support Vector Machine (SVM).
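The abstract does not say how the degree of relationship is computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes the degree is the share of seed-containing documents in which a word also occurs, uses raw term frequency as a stand-in for the unspecified feature extraction step, and omits stemming and the SVM stage. The names `EXTRA_WORDS`, `preprocess`, `seed_features`, and `representative_vector` are hypothetical, and English toy text stands in for Persian documents.

```python
from collections import Counter

# Hypothetical "extra word" (stop-word) list, for illustration only.
EXTRA_WORDS = {"the", "a", "and", "in", "of"}


def preprocess(text):
    """Lowercase, split on whitespace, and drop extra words (no stemming here)."""
    return [w for w in text.lower().split() if w not in EXTRA_WORDS]


def seed_features(docs, k):
    """Return the k most frequent terms of a category; ties break alphabetically."""
    counts = Counter(w for doc in docs for w in doc)
    return sorted(counts, key=lambda w: (-counts[w], w))[:k]


def representative_vector(docs, seeds, size):
    """Map related words to an assumed degree of relationship.

    Assumption: degree = share of seed-containing documents in which
    the word also occurs. The paper may define it differently.
    """
    seed_set = set(seeds)
    seed_docs = [set(doc) for doc in docs if seed_set & set(doc)]
    if not seed_docs:
        return {}
    # Count, per non-seed word, how many seed-containing documents include it.
    cooc = Counter(w for doc in seed_docs for w in doc - seed_set)
    degree = {w: n / len(seed_docs) for w, n in cooc.items()}
    best = sorted(degree, key=lambda w: (-degree[w], w))[:size]
    return {w: degree[w] for w in best}


# Toy category with three short documents.
sports = [preprocess(t) for t in (
    "the team won the match and a goal",
    "the team lost the match",
    "the team and the coach in a goal",
)]
seeds = seed_features(sports, 1)
vector = representative_vector(sports, seeds, 2)
```

On this toy data the single seed is "team", and the vector holds "goal" and "match", each with degree 2/3. In a full pipeline the seeds plus these related words would form the feature set passed to a classifier such as an SVM.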
Pages: 282 / +
Page count: 4