Leveraging fine-grained mobile data for churn detection through Essence Random Forest

Cited by: 5
Authors
Colot, Christian [1]
Baecke, Philippe [2]
Linden, Isabelle [1]
Affiliations
[1] Univ Namur, Dept Business Adm, Namur, Belgium
[2] Vlerick Business Sch, Ghent, Belgium
Keywords
Telecom data; Random Forest; Customer churn; Customer analytics; Unstructured data; Probability models; HIGH-DIMENSIONAL DATA; MAXIMUM RELEVANCE; SOCIAL-INFLUENCE; PREDICTION; SELECTION; CLASSIFICATION; ALGORITHMS; NETWORKS; ENSEMBLE; OTT
DOI
10.1186/s40537-021-00451-9
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The rise of unstructured data creates unprecedented opportunities for marketing applications, along with new methodological challenges in leveraging such data. In particular, redundancy among the features extracted from this data deserves special attention, as it might prevent current methods from benefiting from it. In this study, we investigate the value of multiple fine-grained data sources, i.e., web surfing, application use, and geospatial mobility, for churn detection within telephone companies. This value is analysed both as a substitute for and as a complement to the well-known communication network. We also propose an adaptation of the Random Forest algorithm, called Essence Random Forest, designed to better address redundancy among extracted features. Analysing fine-grained data from a telephone company, we first find that geospatial mobility data might be a good long-term alternative to the classical communication network, which might become obsolete due to competition from digital communications. We then show that, in the short term, these alternative fine-grained data might complement the communication network for improved churn detection. In addition, compared to Random Forest and Extremely Randomized Trees, Essence Random Forest better leverages the value of unstructured data by offering enhanced churn detection from both perspectives, i.e., substitution and complement. Finally, Essence Random Forest converges faster to stable results, which is a desirable property in a resource-constrained environment.
Pages: 26