Churn prediction of mobile and online casual games using play log data

Cited by: 39

Authors:

Kim, Seungwook [1]

Choi, Daeyoung [1]

Lee, Eunjung [1]

Rhee, Wonjong [1]

Affiliations:

[1] Seoul Natl Univ, Dept Transdisciplinary Studies, Seoul, South Korea

Source:

PLOS ONE | 2017 / Vol. 12 / Issue 07

Funding:

National Research Foundation of Singapore;

Keywords:

AREA;

DOI:

10.1371/journal.pone.0180735

Chinese Library Classification (CLC) number:

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

Internet-connected devices, especially mobile devices such as smartphones, have become widely accessible in the past decade. Interaction with such devices has evolved into frequent and short-duration usage, and this phenomenon has resulted in a pervasive popularity of casual games in the game sector. On the other hand, development of casual games has become easier than ever as a result of the advancement of development tools. With the resulting fierce competition, now both acquisition and retention of users are the prime concerns in the field. In this study, we focus on churn prediction of mobile and online casual games. While churn prediction and analysis can provide important insights and action cues on retention, its application using play log data has been primitive or very limited in the casual game area. Most of the existing methods cannot be applied to casual games because casual game players tend to churn very quickly and they do not pay periodic subscription fees. Therefore, we focus on the new players and formally define churn using observation period (OP) and churn prediction period (CP). Using the definition, we develop a standard churn analysis process for casual games. We cover essential topics such as preprocessing of raw data, feature engineering including feature analysis, churn prediction modeling using traditional machine learning algorithms (logistic regression, gradient boosting, and random forests) and two deep learning algorithms (CNN and LSTM), and sensitivity analysis for OP and CP. Play log data of three different casual games are considered by analyzing a total of 193,443 unique player records and 10,874,958 play log records. While the analysis results provide useful insights, the overall results indicate that a small number of well-chosen features used as performance metrics might be sufficient for making important action decisions and that OP and CP should be properly chosen depending on the analysis goal.
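
Illustrative sketch (not part of the original record): the abstract says churn is defined through an observation period (OP) and a churn prediction period (CP) but does not give the rule. One plausible reading, assumed here, is that OP starts at a new player's first recorded play, CP is the window immediately after OP, and a player with no play record inside CP counts as a churner. The function names, the day-based parameters, and the half-open window boundaries below are all hypothetical choices made for illustration.

```python
from datetime import datetime, timedelta


def label_churn(play_times, op_days, cp_days):
    """Return True if the player is a churner under the ASSUMED rule.

    Assumption (not stated in the abstract): OP starts at the player's
    first recorded play and lasts op_days; CP is the cp_days right after
    OP; a player with no play record in [op_end, cp_end) is a churner.
    """
    if not play_times:
        raise ValueError("player has no play records")
    first_play = min(play_times)
    op_end = first_play + timedelta(days=op_days)
    cp_end = op_end + timedelta(days=cp_days)
    played_in_cp = any(op_end <= t < cp_end for t in play_times)
    return not played_in_cp


def op_records(play_times, op_days):
    """Return the play timestamps that fall inside OP, sorted.

    Only these records would be available for feature engineering,
    since anything after OP belongs to the period being predicted.
    """
    first_play = min(play_times)
    op_end = first_play + timedelta(days=op_days)
    return sorted(t for t in play_times if first_play <= t < op_end)


if __name__ == "__main__":
    t0 = datetime(2017, 1, 1)
    returning = [t0, t0 + timedelta(days=1), t0 + timedelta(days=6)]
    gone = [t0, t0 + timedelta(days=2)]
    print(label_churn(returning, op_days=5, cp_days=10))
    print(label_churn(gone, op_days=5, cp_days=10))
```

Under this reading, lengthening OP gives the model more play history per player but delays the prediction, while changing CP changes who counts as a churner at all, which is presumably why the abstract treats sensitivity to OP and CP as a topic of its own.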

Pages: 19
