Cost-Effective User Monitoring for Popularity Prediction of Online User-Generated Content

Cited by: 7
Authors
Yang, Mengmeng [1]
Chen, Kai [1]
Miao, Zhongchen [1]
Yang, Xiaokang [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai 200030, Peoples R China
Source
2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW) | 2014
Keywords
popularity prediction; online user-generated content; cost-effective; user selection;
DOI
10.1109/ICDMW.2014.72
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
In this paper, we study popularity prediction for online user-generated content, where high-quality predictions give much more flexibility and preparation time for deploying limited resources (such as advertising budget and monitoring capacity) toward more popular content. However, the high retrieval cost of the data used in prediction is a major challenge because of the large number of users and content items involved. We propose that highly popular user-generated content can be predicted by concentrating on fewer but more informative users, as we observe that content generated by those users tends to become popular while content generated by the remaining users does not. We develop a cost-effective popularity prediction framework for online prediction. It contains three modules: (a) online data retrieval, (b) informative user selection, and (c) popularity prediction. A hybrid user selection algorithm and several popularity prediction algorithms and improvements are presented, and their performance is evaluated and compared using (a) data generated by the selected users and (b) data generated by all users, retrieved from Sina Weibo Microblogger. The best prediction algorithm reaches 78% accuracy 24 hours after publication when the level width N_l equals 500. And the best combination of prediction and selection algorithms performs only about 7% worse on dataset
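The idea in the abstract of monitoring fewer but informative users can be illustrated with a minimal sketch. This is not the paper's hybrid selection algorithm, which the abstract does not specify. It assumes, hypothetically, that popularity is a repost count, that a popularity level is the count divided into bins of width `level_width`, and that users are ranked by the share of their past posts that crossed a popularity threshold. The names `popularity_level`, `select_informative_users`, `popular_threshold`, and `budget` are illustrative only.

```python
from collections import defaultdict


def popularity_level(repost_count, level_width=500):
    """Map a raw repost count to a discrete level (assumed binning scheme)."""
    return repost_count // level_width


def select_informative_users(history, popular_threshold, budget):
    """Keep the `budget` users whose past posts most often became popular.

    `history` is an iterable of (user, final_repost_count) pairs.
    This ranking rule is an assumption made for illustration.
    """
    totals = defaultdict(int)  # posts per user
    hits = defaultdict(int)    # popular posts per user
    for user, final_count in history:
        totals[user] += 1
        if final_count >= popular_threshold:
            hits[user] += 1
    # Rank by hit rate, breaking ties by the absolute number of popular posts.
    ranked = sorted(
        totals,
        key=lambda u: (hits[u] / totals[u], hits[u]),
        reverse=True,
    )
    return set(ranked[:budget])


# Toy history: user "a" is always popular, "b" sometimes, "c" never.
history = [("a", 1200), ("a", 900), ("b", 10), ("b", 700), ("c", 5), ("c", 3)]
selected = select_informative_users(history, popular_threshold=500, budget=2)
```

With this toy history, only users "a" and "b" would be monitored, so the retrieval cost covers two users instead of three while still capturing every post that crossed the threshold.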
Pages: 944-951
Number of pages: 8