Improvement in Automatic Classification of Persian Documents by Means of Support Vector Machine and Representative Vector

Cited by: 0

Authors:

Ashkan, Jafari [1]

Hamed, Ezadi [2]

Mihan, Hossennejad [1]

Taher, Noohi [3]

Affiliations:

[1] Islamic Azad Univ, Jolfa Branch, Jolfa, Iran

[2] Islamic Azad Univ, Eghlid Branch, Eghlid, Iran

[3] Islamic Azad Univ, Najafabad Branch, Najafabad, Iran

Source:

INNOVATIVE COMPUTING TECHNOLOGY | 2011 / Vol. 241

Keywords:

Documents Classification; Representative Vector; Stemming; Support Vector Machine; CATEGORIZATION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A Representative Vector is a vector that contains related words together with the degree of their relationship. This paper examines the effect of using such vectors on the automatic classification of Persian documents. In this method, the documents are first preprocessed, and extra words and word stems are identified. Next, features are extracted for each category using one of the known methods. Then the Representative Vector, built from the extracted features, yields more specific words that better represent each category. The experiments show that Precision and Recall can be increased significantly by omitting extra words and adding a few words in the Representative Vectors, combined with a well-known classification model such as the Support Vector Machine (SVM).
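The idea of a vector of related words with relationship degrees can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not say how the degree of relationship is computed, so the sketch assumes it is the fraction of seed-containing documents in which a word also appears. The function name `representative_vector`, its parameters, and the English toy documents are hypothetical. Stemming and the SVM classification step are omitted.

```python
from collections import Counter


def representative_vector(docs, seed_features, extra_words=frozenset(), top_k=3):
    """Return up to top_k (word, degree) pairs related to the seed features.

    Assumption (not from the paper): the degree of relationship is the
    fraction of seed-containing documents in which the word also occurs.
    """
    seeds = set(seed_features)
    ignored = set(extra_words)
    cooccurrence = Counter()
    seed_docs = 0
    for tokens in docs:
        words = set(tokens) - ignored      # drop extra words
        if not words & seeds:              # skip documents without any seed feature
            continue
        seed_docs += 1
        cooccurrence.update(words - seeds)  # count each co-occurring word once per document
    if seed_docs == 0:
        return []
    scored = [(word, count / seed_docs) for word, count in cooccurrence.items()]
    # Highest degree first; ties broken alphabetically for a stable result.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Toy tokenized documents for one hypothetical category.
sports_docs = [
    ["the", "team", "scored", "a", "goal", "in", "the", "match"],
    ["the", "coach", "praised", "the", "team", "goal"],
    ["the", "match", "coach", "accepted", "a", "draw"],
    ["the", "market", "fell"],
]

vector = representative_vector(
    sports_docs,
    seed_features=["team", "match"],
    extra_words={"the", "a", "in"},
    top_k=2,
)
print(vector)
```

In a full pipeline along the lines the abstract describes, the words returned here would be added to the feature set of the category before a classifier such as an SVM is trained.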


Pages: 282 / +

Page count: 4
