Supporting exploratory text analysis in literature study

Cited by: 16
Authors
Muralidharan, Aditi [1]
Hearst, Marti A. [1]
Affiliation
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
Source
LITERARY AND LINGUISTIC COMPUTING | 2013 / Vol. 28 / Issue 02
Keywords
DOI
10.1093/llc/fqs044
Chinese Library Classification
H0 [Linguistics];
Discipline classification codes
030303; 0501; 050102
Abstract
We present WordSeer, an exploratory analysis environment for literary text. Literature study is a cycle of reading, interpretation, exploration, and understanding. While there is now abundant technological support for reading and interpreting literary text in new ways through text-processing algorithms, the other parts of the cycle, exploration and understanding, have been relatively neglected. We are motivated by the literature on sensemaking, an area of computer science devoted to supporting open-ended analysis on large collections of data. Our software system integrates tools for algorithmic processing of text with interaction techniques that support the interpretive, exploratory, and note-taking aspects of scholarship. At present, the system supports grammatical search and contextual similarity determination, visualization of patterns of word context, and examination and organization of the source material for comparison and hypothesis building. This article illustrates its capabilities by analyzing language-use differences between male and female characters in Shakespeare's plays. We find that when love is a major plot point, the language Shakespeare uses to refer to women becomes more physical, and the language referring to men becomes more sentimental. Future work will incorporate additional sensemaking tools to aid comparison, exploration, grouping, and pattern recognition.
Pages: 283-295
Page count: 13