TWITTER VS. PRINTED ENGLISH: AN INFORMATION-THEORETIC COMPARISON

Cited by: 0
Authors
Glennon, Emma [1]
Sankar, Lalitha [1]
Poor, H. Vincent [1]
Affiliations
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
Source
2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2012
Keywords
Twitter; computer mediated communication; information theory; information entropy; redundancy
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The popular social networking and microblogging service Twitter contains language that is very different from what is considered proper. This paper quantifies those linguistic differences between printed English and Tweetspeak using information-theoretic concepts. Letter-based n-gram entropies are calculated and compared to analogous data from two corpora of printed English to demonstrate that (1) Twitter's entropy is overall higher than that of printed English, and (2) individual users' entropies are on average higher the less conventional their language use is. The implications for digitally mediated communication in general are also discussed.
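
As an illustration of the letter-based n-gram entropies mentioned in the abstract, the following is a minimal sketch in the style of Shannon's 1951 estimate, where the n-gram entropy is the block entropy of n-grams minus that of (n-1)-grams. The record does not describe the paper's actual preprocessing or estimator, so everything here is an assumption: the 27-symbol alphabet (a-z plus space), the function names, and the redundancy definition relative to log2(27). Dropping digits and symbols may also discard some of the unconventional features of tweets, so a real comparison would likely need a wider alphabet.

```python
from collections import Counter
from math import log2
from string import ascii_lowercase

# Assumed alphabet: 26 letters plus space, as in Shannon's printed-English setup.
ALPHABET = ascii_lowercase + " "


def normalize(text: str) -> str:
    """Lowercase the text and drop every character outside the assumed alphabet."""
    return "".join(ch for ch in text.lower() if ch in ALPHABET)


def block_entropy(text: str, n: int) -> float:
    """Entropy in bits of the empirical distribution of letter n-grams."""
    if n == 0:
        return 0.0
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    counts = Counter(grams)
    return -sum(c / total * log2(c / total) for c in counts.values())


def ngram_entropy(text: str, n: int) -> float:
    """Estimate F_n = H(n-grams) - H((n-1)-grams), in bits per letter."""
    return block_entropy(text, n) - block_entropy(text, n - 1)


def redundancy(text: str, n: int) -> float:
    """Assumed definition: 1 - F_n / log2(alphabet size)."""
    return 1.0 - ngram_entropy(text, n) / log2(len(ALPHABET))


if __name__ == "__main__":
    # Hypothetical sample strings, not data from the paper.
    printed = normalize("The quick brown fox jumps over the lazy dog.")
    tweet = normalize("omg u r gr8 lol c u l8r 2nite!!")
    for label, sample in (("printed", printed), ("tweet", tweet)):
        print(label, [round(ngram_entropy(sample, n), 3) for n in (1, 2, 3)])
```

On samples this short the estimates are dominated by sparsity, so the higher-order values are unreliable; meaningful comparisons need large corpora.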
Pages: 3069-3072
Page count: 4