Activity-based Twitter sampling for content-based and user-centric prediction models

Cited by: 12
Authors
Aghababaei, Somayyeh [1]
Makrehchi, Masoud [1]
Affiliations
[1] Univ Ontario Inst Technol UOIT, Dept Elect Comp & Software Engn, 2000 Simcoe St N, Oshawa, ON L1H 7K4, Canada
Keywords
Twitter sampling; Temporal prediction models; Historical timelines; User activity; Activity-based sampling
DOI
10.1186/s13673-016-0084-z
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Increasingly more applications rely on crowd-sourced data from social media. Some of these applications are concerned with real-time data streams, while others are more focused on acquiring temporal footprints from historical data. Nevertheless, determining the subset of "credible" users is crucial. While the majority of sampling approaches focus on individual static networks, dynamic user activity over time is usually not considered, which may result in activity gaps in the collected data. Models based on noisy and missing data can significantly degrade in performance. In this study, we demonstrate how to sample Twitter users in order to produce more credible data for temporal prediction models. We present an activity-based sampling approach where users are selected based on their historical activities in Twitter. The predictability of the collected content from activity-based and random sampling is compared in a content-based and user-centric temporal model. The results indicate the importance of an activity-oriented sampling method for the acquisition of more credible content for temporal models.
Pages: 20