A Churn Prediction Model Using Random Forest: Analysis of Machine Learning Techniques for Churn Prediction and Factor Identification in Telecom Sector

Cited by: 110
Authors
Ullah, Irfan [1]
Raza, Basit [1]
Malik, Ahmad Kamran [1]
Imran, Muhammad [1]
Ul Islam, Saif [2]
Kim, Sung Won [3]
Affiliations
[1] CUI, Dept Comp Sci, Islamabad 45550, Pakistan
[2] Dr AQ Khan Inst Comp Sci & Informat Technol, Dept Comp Sci, Rawalpindi 47320, Pakistan
[3] Yeungnam Univ, Dept Informat & Commun Engn, Gyongsan 38542, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Churn prediction; retention; telecom; CRM; machine learning; CLASS IMBALANCE PROBLEM; CUSTOMER CHURN; TELECOMMUNICATION; SYSTEM;
DOI
10.1109/ACCESS.2019.2914999
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In the telecom sector, a vast client base generates a huge volume of data every day. Decision makers and business analysts emphasize that acquiring new customers is costlier than retaining existing ones. Business analysts and customer relationship management (CRM) analysts need to know why customers churn, as well as the behavior patterns in existing churn customers' data. This paper proposes a churn prediction model that uses both classification and clustering techniques to identify churn customers and to provide the factors behind customer churn in the telecom sector. Feature selection is performed using information gain and a correlation attribute ranking filter. The proposed model first classifies churn customers' data using classification algorithms, among which the Random Forest (RF) algorithm performed well, with 88.63% correctly classified instances. Creating effective retention policies is an essential CRM task for preventing churn. After classification, the proposed model segments the churning customers' data by grouping churn customers using cosine similarity, so that group-based retention offers can be made. This paper also identifies churn factors that are essential in determining the root causes of churn. By knowing the significant churn factors from customers' data, CRM can improve productivity, recommend relevant promotions to groups of likely churn customers based on similar behavior patterns, and substantially improve the company's marketing campaigns. The proposed churn prediction model is evaluated using metrics such as accuracy, precision, recall, F-measure, and receiver operating characteristic (ROC) area. The results reveal that the proposed churn prediction model produced better churn classification using the RF algorithm and customer profiling using k-means clustering. Furthermore, it provides the factors behind customer churn through rules generated by the attribute-selected classifier algorithm.
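The abstract names accuracy, precision, recall, and F-measure as evaluation metrics and cosine similarity as the basis for grouping churn customers. A minimal Python sketch of those two calculations follows. It is an illustration only, not the authors' implementation: the function names, the label encoding (1 = churn), and the toy data are assumptions introduced here.

```python
import math


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure for a binary churn label.

    `positive` marks the churn class; the encoding 1 = churn is an assumption.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


# Hypothetical labels: 1 = churned, 0 = retained.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)

# Hypothetical usage profiles of two churn customers.
similarity = cosine_similarity([1.0, 2.0], [2.0, 4.0])
```

Customers whose profile vectors have a cosine similarity close to 1 show proportionally similar behavior, which is the kind of signal a group-based retention offer could be built on.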
Pages: 60134-60149
Page count: 16