Stylistic constancy and change across literary corpora: Using measures of lexical richness to date works

Cited by: 22
Authors
Smith, JA
Kelly, C [1]
Affiliations
[1] San Diego State Univ, Dept Math & Comp Sci, San Diego, CA 92182 USA
[2] San Diego State Univ, Dept Class & Humanities, San Diego, CA 92120 USA
Source
COMPUTERS AND THE HUMANITIES | 2002 / Vol. 36 / Issue 04
Keywords
chronology; hapax; prediction; vocabulary; Yule's K
DOI
10.1023/A:1020201615753
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
The measure of the lexical richness of literary texts as a tool in the comparative analysis of literary style has been hampered by the problem of the inequality of text lengths within and between literary corpora. This paper proposes an empirical method of description of lexical richness by averaging measures on multiple chunks of text of a standard length within a literary work or corpus. A work's average vocabulary richness, average portion of hapax legomena of the corpus from which it derives, and average repetition of frequently appearing vocabulary may then characterize that work relative to other works partitioned along with it. This method reveals the possibility of significant variance of these measures of vocabulary among works of a single author's corpus and warns against the notion of some absolute authorial stylistic character. We apply this method of vocabulary averaging to the corpora of three playwrights from classical antiquity whose works are chronologically rankable: Euripides, Aristophanes, and Terence. We look for trends in vocabulary richness over time, which we posit functions as an indicator of progressively changing authorial ability or inclination. This method then holds the potential of predicting dates for undateable or tenuously dated works within a corpus of otherwise securely dated texts. From the results derived, a relatively late date for the composition of the redrafted version of Aristophanes' Clouds appears likely; we predict an early composition date for the redraft of Terence's Hecyra (and thus are inclined to think that the playwright did very little redrafting); and finally we find Euripides' Electra and Supplices exhibiting vocabulary characteristics of extremely late composition and we predict dates much later than those assigned based on metrical considerations.
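The chunk-averaging idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace tokenization in the usage note, the use of the type-token ratio as the "vocabulary richness" measure, and the choice to count hapax legomena within each chunk (rather than relative to the whole corpus, as the abstract describes) are all assumptions made for illustration. Yule's K is computed with the standard formula K = 10^4 (Σ i² V_i − N) / N², where V_i is the number of word types occurring i times in a chunk of N tokens.

```python
from collections import Counter


def fixed_chunks(tokens, size):
    """Split tokens into consecutive chunks of exactly `size` tokens.

    A trailing remainder shorter than `size` is dropped so that every
    chunk has the same length (an assumption of this sketch).
    """
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]


def type_token_ratio(chunk):
    """Distinct word types divided by total tokens in the chunk."""
    return len(set(chunk)) / len(chunk)


def hapax_proportion(chunk):
    """Share of tokens that occur exactly once within the chunk.

    Simplification: hapax are counted per chunk, not per corpus.
    """
    counts = Counter(chunk)
    return sum(1 for c in counts.values() if c == 1) / len(chunk)


def yules_k(chunk):
    """Yule's K: 10^4 * (sum_i i^2 * V_i - N) / N^2."""
    n = len(chunk)
    # spectrum[i] = number of types that occur exactly i times (V_i)
    spectrum = Counter(Counter(chunk).values())
    s2 = sum(i * i * v for i, v in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)


def averaged_measures(tokens, size):
    """Average each measure over all fixed-length chunks of a work."""
    parts = fixed_chunks(tokens, size)
    if not parts:
        raise ValueError("text is shorter than one chunk")
    k = len(parts)
    return {
        "chunks": k,
        "ttr": sum(type_token_ratio(p) for p in parts) / k,
        "hapax": sum(hapax_proportion(p) for p in parts) / k,
        "yules_k": sum(yules_k(p) for p in parts) / k,
    }
```

For example, `averaged_measures(text.split(), 1000)` would average the three measures over successive 1000-token chunks of a hypothetical string `text`; the chunk length is a free parameter here, and real Greek or Latin texts would need proper tokenization and possibly lemmatization first.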
Pages: 411-430
Page count: 20