Analyzing spatiotemporal trends in social media data via smoothing spline analysis of variance

Cited by: 13
Authors
Helwig, Nathaniel E. [1, 2]
Gao, Yizhao [3]
Wang, Shaowen [3, 4]
Ma, Ping [5]
Affiliations
[1] Univ Minnesota, Dept Psychol, Minneapolis, MN 55455 USA
[2] Univ Minnesota, Sch Stat, Minneapolis, MN 55455 USA
[3] Univ Illinois, Dept Geog & Geog Informat Sci, Champaign, IL 61820 USA
[4] Univ Illinois, Natl Ctr Supercomp Applicat, Urbana, IL 61801 USA
[5] Univ Georgia, Dept Stat, Athens, GA 30602 USA
Funding
National Science Foundation (USA)
Keywords
Smoothing spline; Social media; Spatial smoothing; Spatiotemporal smoothing; Bayesian confidence intervals; Computation; Regression; Models
DOI
10.1016/j.spasta.2015.09.002
Chinese Library Classification (CLC) number
P [Astronomy, Earth Sciences]
Subject classification code
07
Abstract
Social media have become an integral part of life for many individuals, and social media websites generate incredible amounts of data on a variety of societal topics. Furthermore, some social media posts contain geolocation information, so social media data can be viewed as a spatiotemporal phenomenon. To understand spatiotemporal trends in ultra-large sample social media data, we propose a novel application of the Smoothing Spline Analysis of Variance (SSANOVA) framework, which is a nonparametric approach capable of discovering latent functional relationships in noisy data. Unlike currently available approaches, our proposed SSANOVA framework (a) makes few assumptions about the nature of the spatiotemporal trend, (b) provides a means of assessing the uncertainty of the estimated spatiotemporal trend, and (c) is scalable to analyze massive samples of social media data. To demonstrate the potential of our approach, we model the daily spatiotemporal Twitter trend in the United States. Our results reveal that the proposed SSANOVA approach can provide accurate and informative estimates of spatiotemporal social media trends, as well as useful information about the precision of the estimated spatiotemporal trends. (C) 2015 Elsevier B.V. All rights reserved.
Pages: 491-504
Page count: 14