A classifier system for author recognition using synonym-based features

Citations: 0
Authors
Clark, Jonathan H. [1 ]
Hannon, Charles J. [1 ]
Affiliations
[1] Texas Christian Univ, Dept Comp Sci, Ft Worth, TX 76129 USA
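Source
MICAI 2007: ADVANCES IN ARTIFICIAL INTELLIGENCE | 2007 / Vol. 4827
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes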
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The writing style of an author is a phenomenon that computer scientists and stylometrists have modeled in the past with some success. However, due to the complexity and variability of writing styles, simple models often break down when faced with real-world data. Thus, current trends in stylometry often employ hundreds of features in building classifier systems. In this paper, we present a novel set of synonym-based features for author recognition. We outline a basic model of how synonyms relate to an author's identity and then build two additional models refined to meet real-world needs. Experiments show a strong correlation between the presented metric and the writing style of four authors, with the second of the three models outperforming the others. As modern stylometric classifier systems demand increasingly large feature sets, this new set of synonym-based features will serve to fill this ever-increasing need.
Pages: 839 / +
Page count: 3
Related papers
15 items in total
  • [1] BRINEGAR CS, 1963, J AM STAT ASS, V58
  • [2] CLARK JH, 2007, ALGORITHM IDENTIFYIN
  • [3] FUCKS W, 1952, BIOMETRIKA, V39, P122, DOI 10.2307/2332470
  • [4] Glover A, 1996, DETECTING STYLISTIC
  • [5] Coh-Metrix: Analysis of text on cohesion and language
    Graesser, AC
    McNamara, DS
    Louwerse, MM
    Cai, ZQ
    [J]. BEHAVIOR RESEARCH METHODS INSTRUMENTS & COMPUTERS, 2004, 36 (02): : 193 - 202
  • [6] HOLMES DI, 1994, AUTHORSHIP ATTRIBUTI, V28
  • [7] Khmelev D. V., 2001, Literary & Linguistic Computing, V16, P299, DOI 10.1093/llc/16.3.299
  • [8] Mannion D., 2004, Literary & Linguistic Computing, V19, P497, DOI 10.1093/llc/19.4.497
  • [9] WORDNET - A LEXICAL DATABASE FOR ENGLISH
    MILLER, GA
    [J]. COMMUNICATIONS OF THE ACM, 1995, 38 (11) : 39 - 41
  • [10] PENG F, 2004, 11 C EUR CHAP ASS CO