Authorship Attribution: A Comparative Study of Three Text Corpora and Three Languages

Cited by: 14
Author
Savoy, Jacques [1]
Affiliation
[1] Univ Neuchatel, Dept Comp Sci, CH-2000 Neuchatel, Switzerland
Keywords
DELTA
DOI
10.1080/09296174.2012.659003
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
The first objective of this paper is to carry out three experiments intended to evaluate authorship attribution methods based on three test collections available in three different languages (English, French, and German). In the first, we represent and categorize 52 text excerpts written by nine authors and taken from 19th century English novels. In the second, we work with 44 segments from French novels written by eleven authors, mostly from the 19th century. In the third, we extract 59 German text excerpts from novels published mainly during the 19th and the beginning of the 20th century, written by 15 authors. The second objective is to analyse performance differences obtained when using word types or lemmas as text representations. The third objective is to evaluate three authorship attribution schemes: the first uses principal component analysis (PCA), the second applies the Delta approach, and the third is a new authorship attribution method based on specific vocabulary. This specific vocabulary is computed for a given text (or author profile) and then compared with the entire corpus. Based on this information, we show how a distance measure can be derived and, by means of the nearest neighbor approach, we suggest a simple and efficient authorship attribution scheme. Based on three test collections and using either word types or lemmas as features, we demonstrate that the suggested classification scheme performs better than the PCA method, and slightly better than the Delta approach.
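For readers unfamiliar with the Delta approach mentioned in the abstract, the following is a minimal sketch of a Burrows-style Delta distance combined with a nearest neighbor decision. It is not the paper's implementation and it does not reproduce the specific-vocabulary method. The function names, the choice of population standard deviation, the zero-spread guard, and the default of 50 features are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary item in one token list."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[w] / total for w in vocab]


def delta_attribution(profiles, query_tokens, n_features=50):
    """Hypothetical helper: attribute a query text to the nearest author profile.

    profiles: dict mapping author name -> list of tokens (word types or lemmas).
    Returns (closest_author, {author: delta_distance}).
    """
    # Features: the most frequent items over all known texts (assumed setup).
    pooled = Counter()
    for tokens in profiles.values():
        pooled.update(tokens)
    vocab = [w for w, _ in pooled.most_common(n_features)]

    freqs = {a: relative_freqs(t, vocab) for a, t in profiles.items()}

    # Per-feature mean and standard deviation across the author profiles.
    columns = list(zip(*freqs.values()))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # assumption: guard zero spread

    def zscores(fs):
        return [(f - m) / s for f, m, s in zip(fs, mus, sigmas)]

    zq = zscores(relative_freqs(query_tokens, vocab))

    # Delta: mean absolute difference between the two z-score vectors.
    distances = {}
    for author, fs in freqs.items():
        za = zscores(fs)
        distances[author] = mean([abs(x - y) for x, y in zip(zq, za)])

    # Nearest neighbor rule: pick the profile with the smallest distance.
    closest = min(distances, key=distances.get)
    return closest, distances
```

Calling `delta_attribution(profiles, query_tokens)` with tokenized author profiles returns the closest author and the full distance table, so the margin between the best and second-best candidate can be inspected.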
Pages: 132-161
Page count: 30