Popularity prediction of movies: from statistical modeling to machine learning techniques

Cited by: 0
Authors
Syed Muhammad Raza Abidi
Yonglin Xu
Jianyue Ni
Xiangmeng Wang
Wu Zhang
Affiliations
[1] Shanghai University, School of Computer Engineering and Science
[2] Shanghai University, Shanghai Institute of Applied Mathematics and Mechanics
Source
Multimedia Tools and Applications | 2020 / Vol. 79
Keywords
Movie popularity; Machine learning; Movie success; Regression; IMDb; Supervised learning;
DOI
Not available
CLC number
Subject classification number
Abstract
Film industries around the world produce several hundred movies at a rapid pace and attract audiences of all ages. Every movie producer is keenly interested in knowing which movies are likely to be a hit or a flop at the box office, so early prediction of a movie's popularity is of the utmost importance to the film industry. In this study, we examine the factors and hidden patterns that make a movie popular. Past studies applied machine learning techniques to blog articles, social networks, and social media to predict the success of a movie. Those works focused on which algorithms are better at predicting success, but paid less attention to the data and attributes of a given movie viewed from various angles. In this paper, we investigate this perspective and how it may relate to the prediction results. Data were collected from the publicly available Internet Movie Database (IMDb). We implemented five machine learning algorithms, namely Generalized Linear Model (GLM), Deep Learning (DL), Decision Tree (DT), Random Forest (RF), and Gradient Boosted Tree (GBT), using Root Mean Squared Error (RMSE) as the performance metric, and obtained performance values of GLM: 47.9%, DL: 51.1%, DT: 54.5%, RF: 50.0%, and GBT: 49.5%, respectively. We found that GLM is the best-performing regression model because it has the lowest RMSE, and a lower RMSE is considered better.
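The selection rule stated in the abstract (prefer the model with the lowest RMSE) can be illustrated with a minimal Python sketch. The `rmse` helper and the `reported` dictionary are illustrative names, not taken from the paper, and the sketch assumes the five reported percentages can be compared directly as error scores where lower is better, as the abstract implies.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Values as reported in the abstract, in percent.
# Assumption: they are treated here as RMSE-style scores (lower is better).
reported = {"GLM": 47.9, "DL": 51.1, "DT": 54.5, "RF": 50.0, "GBT": 49.5}

# Pick the model with the smallest reported score.
best_model = min(reported, key=reported.get)
print(best_model, reported[best_model])
```

Under that assumption the minimum falls on GLM, which matches the conclusion stated in the abstract.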
Pages: 35583-35617
Page count: 34