An empirical comparison of techniques for the class imbalance problem in churn prediction

Cited by: 106
Authors
Zhu, Bing [1, 2]
Baesens, Bart [2, 3]
vanden Broucke, Seppe K. L. M. [2]
Affiliations
[1] Sichuan Univ, Business Sch, Chengdu 610064, Peoples R China
[2] Katholieke Univ Leuven, Dept Decis Sci & Informat Management, B-3000 Leuven, Belgium
[3] Univ Southampton, Sch Management, Southampton SO17 1BJ, Hants, England
Funding
National Natural Science Foundation of China;
Keywords
Churn prediction; Class imbalance; Benchmark experiment; Expected maximum profit measure; CUSTOMER CHURN; CLASSIFICATION; FRAMEWORK; MACHINE;
DOI
10.1016/j.ins.2017.04.015
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Class imbalance poses significant challenges for customer churn prediction, and many solutions have been developed to address it. In this paper, we comprehensively compare the performance of state-of-the-art techniques for dealing with class imbalance in the context of churn prediction. A recently developed expected maximum profit criterion is used as one of the main performance measures to offer more insight from a cost-benefit perspective. The experimental results show that the evaluation metric applied has a great impact on the measured performance of the techniques. An in-depth exploration of how the techniques react to different measures is conducted through intra-family comparison within each solution group and global comparison among representative techniques from different groups. The results also indicate that there is much room to improve the solutions' performance in terms of the profit-based measure. Our study offers valuable insights for academics and professionals, and it provides a baseline for developing new methods for dealing with class imbalance in churn prediction. (C) 2017 Elsevier Inc. All rights reserved.
Pages: 84-99
Page count: 16