@Phillies Tweeting from Philly? Predicting Twitter User Locations with Spatial Word Usage

Cited by: 71
Authors
Chang, Hau-Wen [1]
Lee, Dongwon [1]
Eltaher, Mohammed [2]
Lee, Jeongkyu [2]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] Univ Bridgeport, Bridgeport, CT 06601 USA
Source
2012 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM) | 2012
Funding
National Science Foundation (USA);
Keywords
DOI
10.1109/ASONAM.2012.29
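Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract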
We study the problem of predicting the home locations of Twitter users from the content of their tweets. Using three probability models for locations, we compare the Gaussian Mixture Model (GMM) and Maximum Likelihood Estimation (MLE). In addition, we propose two novel unsupervised methods, based on the notions of Non-Localness and Geometric-Localness, to prune noisy data from tweet messages. In the experiments, our unsupervised approach improves significantly on the baselines and achieves results comparable to the supervised state-of-the-art method. For 5,113 Twitter users in the test set, our approach, using 250 or fewer selected local words, predicts home locations within 100 miles with an accuracy of 0.499 on average, or with an average error distance of 509.3 miles at best.
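The two evaluation measures named in the abstract (accuracy within 100 miles and average error distance) can be illustrated with a minimal sketch. It assumes great-circle (haversine) distance between predicted and true home coordinates and treats a prediction as correct when the error is at most the threshold; neither detail is stated in the abstract. The function names `haversine_miles` and `evaluate` are hypothetical and are not taken from the paper.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def evaluate(predicted, actual, threshold_miles=100.0):
    """Return (accuracy within threshold, average error distance in miles).

    `predicted` and `actual` are equal-length lists of (lat, lon) tuples.
    Assumption: "within 100 miles" is read as error <= threshold.
    """
    errors = [haversine_miles(p[0], p[1], a[0], a[1]) for p, a in zip(predicted, actual)]
    accuracy = sum(1 for e in errors if e <= threshold_miles) / len(errors)
    average_error = sum(errors) / len(errors)
    return accuracy, average_error


# Illustrative coordinates only (approximate city centers), not data from the paper.
philadelphia = (39.9526, -75.1652)
new_york = (40.7128, -74.0060)
los_angeles = (34.0522, -118.2437)

accuracy, average_error = evaluate([philadelphia, philadelphia], [new_york, los_angeles])
```

In this toy case one of the two predictions falls within 100 miles of the true location, so the accuracy is 0.5, while the average error distance is dominated by the cross-country miss. This shows why the two measures are reported separately.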
Pages: 111-118
Page count: 8