A Big Data-Driven Hybrid Model for Enhancing Streaming Service Customer Retention Through Churn Prediction Integrated With Explainable AI

Cited by: 2
Authors
Gani Joy, Usman [1]
Hoque, Kazi Ekramul [1]
Nazim Uddin, Mohammed [1]
Chowdhury, Linkon [1]
Park, Seung-Bo [2]
Affiliations
[1] East Delta Univ, Sch Sci Engn & Technol, Chattagram 4209, Bangladesh
[2] Inha Univ, Dept Software Convergence Engn, Incheon 22212, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Research Foundation, Singapore;
Keywords
Artificial intelligence; classification algorithms; deep learning; decision support systems; explainable AI; model interpretation; semi-supervised learning; big data analysis; NEURAL-NETWORKS; SELECTION;
DOI
10.1109/ACCESS.2024.3401247
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Customer churn prediction is a critical issue for streaming services, because retaining existing subscribers is vital to the success of the business. Reliable churn prediction models matter because acquiring new customers usually costs more than retaining existing ones. In this study, we propose a big data-driven hybrid model that combines a deep neural network with a machine-learning model to forecast customer churn efficiently. The proposed model uses Long Short-Term Memory (LSTM) with a Gated Recurrent Unit (GRU) to capture trends in subscribers' usage patterns over time. A Light Gradient Boosting Machine (LightGBM) then uses insights from the sequential modeling along with the original attributes to forecast churn. Feature selection techniques such as Chi-squared testing and Sequential Feature Selection (SFS) are used to choose the optimal set of features for the proposed model. Several individual models, including deep learning and traditional machine learning algorithms, are also evaluated and compared with the proposed hybrid model. Additionally, the study illustrates model interpretation using Shapley Additive Explanations (SHAP) and the Explainable Boosting Machine (EBM), which identify influential features in streaming services and thereby support customer retention efforts. These techniques provide transparency into the proposed model's forecasts, making them more actionable and understandable for decision-makers. Extensive experimental evaluation demonstrates that the hybrid model achieves best-in-class performance, with 95.60% AUC and 90.09% F1 score.
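The Chi-squared feature selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the test is applied, so this assumes a plain Pearson chi-squared test of independence between each categorical feature and the churn label, with features ranked by the resulting statistic. The feature names (`plan`, `device`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def chi2_statistic(feature, label):
    """Pearson chi-squared statistic for independence of two categorical sequences."""
    n = len(feature)
    observed = Counter(zip(feature, label))   # joint counts per (feature value, label)
    feature_totals = Counter(feature)         # row totals of the contingency table
    label_totals = Counter(label)             # column totals of the contingency table
    stat = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            expected = f_count * l_count / n  # count expected under independence
            stat += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return stat


def rank_features(columns, label):
    """Return (name, statistic) pairs, strongest association with the label first."""
    scores = {name: chi2_statistic(values, label) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = churned, 0 = retained.
churned = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "plan": ["basic"] * 4 + ["premium"] * 4,  # perfectly aligned with churn
    "device": ["tv", "phone"] * 4,            # unrelated to churn
}

ranking = rank_features(columns, churned)
for name, score in ranking:
    print(f"{name}: chi2 = {score:.2f}")
```

In this toy data `plan` separates churners from non-churners completely while `device` is evenly split, so `plan` ranks first. A real pipeline would more likely use a library routine and combine this filter with a wrapper method such as SFS, as the abstract describes.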
Pages: 69130 - 69150
Number of pages: 21