Authorship identification based on support vector machine

Cited by: 0
Authors
Yoshida, A [1]
Nobesawa, S [1]
Sato, K [1]
Saito, H [1]
Affiliation
[1] Keio Univ, Dept Informat & Comp Sci, Yokohama, Kanagawa 2238522, Japan
Source
6TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL III, PROCEEDINGS: IMAGE, ACOUSTIC, SPEECH AND SIGNAL PROCESSING I | 2002
Keywords
authorship identification; features; support vector machine
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In recent years, various techniques and features have been used in research on authorship identification. We propose a method of authorship identification that learns the features of each author using a support vector machine (SVM). An SVM achieves high generalization performance even when the input data lie in a very high-dimensional feature space. In addition, by introducing a kernel function, an SVM can be trained in high-dimensional spaces at a small computational cost that is independent of their dimensionality. We adopted 11 features and ran an experiment identifying 5 Japanese writers to confirm the effectiveness of our method. As a result, we achieved an accuracy of more than 80%. We also found that the frequency of consecutive characters and the kinds of characters are effective features for authorship identification.
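The abstract names two feature families, the frequency of consecutive characters and the kinds of characters, without giving their exact definitions. The following is a minimal sketch of how such features might be computed, not the authors' implementation. It assumes that "consecutive characters" means character bigrams and that "kinds of characters" means coarse script classes (hiragana, katakana, kanji, other) identified by Unicode block; the function names and the `KINDS` tuple are hypothetical. SVM training is left out; the resulting values would be assembled into a feature vector for an SVM library of choice.

```python
from collections import Counter

# Hypothetical coarse script classes; the paper's own categories are not given.
KINDS = ("hiragana", "katakana", "kanji", "other")


def char_kind(ch):
    """Classify one character by Unicode block (an assumed scheme)."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    return "other"


def bigram_frequencies(text):
    """Relative frequency of each pair of adjacent characters in text."""
    bigrams = [text[i:i + 2] for i in range(len(text) - 1)]
    total = len(bigrams)
    if total == 0:
        return {}
    return {bg: n / total for bg, n in Counter(bigrams).items()}


def kind_proportions(text):
    """Share of characters in text that fall into each script class."""
    total = len(text)
    counts = Counter(char_kind(ch) for ch in text)
    return {k: (counts[k] / total if total else 0.0) for k in KINDS}
```

For a given text, `bigram_frequencies` and `kind_proportions` each return a dictionary of relative frequencies; fixing a shared bigram vocabulary across all authors would be needed before turning these into equal-length vectors.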
Pages: 423-428
Number of pages: 6