Predicting Self-declared Movie Watching Behavior Using Facebook Data and Information-Fusion Sensitivity Analysis

Cited by: 6
Authors
Bogaert, Matthias [1, 2]
Ballings, Michel [3]
Bergmans, Rob [2]
Van den Poel, Dirk [2]
Affiliations
[1] Univ Edinburgh, Business Sch, 29 Buccleuch Pl, Edinburgh EH8 9JS, Midlothian, Scotland
[2] Univ Ghent, Dept Mkt, Tweekerkenstr 2, B-9000 Ghent, Belgium
[3] Univ Tennessee, Haslam Coll Business, Dept Business Analyt & Stat, 916 Volunteer Blvd,249 Stokely Management Ctr, Knoxville, TN 37996 USA
Keywords
Facebook; information-fusion; machine learning; movies; predictive models; social media; DATA ANALYTIC APPROACH; BOX-OFFICE; CHURN PREDICTION; ROTATION FOREST; REVIEWS; MODELS; CLASSIFIERS; SELECTION; TWEETS; AREA
DOI
10.1111/deci.12406
Chinese Library Classification (CLC) number
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
The main purpose of this paper is to evaluate the feasibility of predicting whether a Facebook user has self-reported watching a given movie genre. To that end, we apply a data-analytical framework that (1) builds and evaluates several predictive models explaining self-declared movie watching behavior, and (2) provides insight into the importance of the predictors and their relationship with self-reported movie watching behavior. For the first outcome, we benchmark several algorithms (logistic regression, random forest, adaptive boosting, rotation forest, and naive Bayes) and evaluate their performance using the area under the receiver operating characteristic curve. For the second outcome, we evaluate variable importance and build partial dependence plots using information-fusion sensitivity analysis for different movie genres. To gather the data, we developed a custom native Facebook app. We resampled our dataset to make it representative of the general Facebook population with respect to age and gender. The results indicate that adaptive boosting outperforms all other algorithms. The top variables are time- and frequency-based variables related to media consumption (movies, videos, and music). To the best of our knowledge, this study is the first to fit predictive models of self-reported movie watching behavior and to provide insights into the relationships that govern these models. Our models can be used as a decision tool for movie producers to target potential movie-watchers and market their movies more efficiently.
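The abstract ranks the algorithms by the area under the receiver operating characteristic curve (AUC). As a minimal sketch of that metric only, and not the authors' implementation, the function below computes AUC through the rank-sum (Mann-Whitney) identity in plain Python. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """AUC via the rank-sum identity; tied scores share an average rank.

    labels: sequence of 0/1 outcomes (1 = user reported watching the genre).
    scores: predicted scores, where higher means "more likely 1".
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the block [i, j) of observations with the same score.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    # Mann-Whitney U for the positives, normalised by the number of
    # positive/negative pairs.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Toy data (hypothetical): four users, two of whom reported watching.
y_true = [0, 0, 1, 1]
model_a = [0.1, 0.4, 0.35, 0.8]   # one positive ranked below a negative
model_b = [0.1, 0.2, 0.7, 0.9]    # positives ranked above all negatives

auc_a = roc_auc(y_true, model_a)
auc_b = roc_auc(y_true, model_b)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so in a benchmark like the one described, the model with the higher AUC (here the hypothetical `model_b`) would be preferred.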
Pages: 776-810
Number of pages: 35