Logistic Regression Ensemble for Predicting Customer Defection with Very Large Sample Size

Cited by: 8
Authors
Kuswanto, Heri [1]
Asfihani, Ayu [1]
Sarumaha, Yogi [1]
Ohwada, Hayato [2]
Affiliations
[1] Inst Teknol Sepuluh Nopember, Dept Stat, Kampus ITS Sukolilo, Surabaya 60111, Indonesia
[2] Tokyo Univ Sci, Grad Sch Sci & Technol, Dept Ind Adm, Noda, Chiba 278, Japan
Source
THIRD INFORMATION SYSTEMS INTERNATIONAL CONFERENCE 2015 | 2015 / Vol. 72
Keywords
ensemble; logistic regression; classification; high dimensional data; machine learning;
DOI
10.1016/j.procs.2015.12.108
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Predicting customer defection is an important subject for companies producing cloud-based software. The company studied sells three products (High, Medium and Low Price), for which the customer may choose to defect or to retain the product after a certain period of time. Because the company has collected a very large dataset, standard statistical models become inapplicable due to the curse of dimensionality. Parametric statistical models tend to produce very large standard errors, which may lead to inaccurate predictions. This research examines a machine learning approach developed for high-dimensional data, namely the logistic regression ensemble (LORENS). Using computational approaches, LORENS predicts as well as the standard logistic regression model, i.e. with between 66% and 77% prediction accuracy. In this case, LORENS is preferable as it is more reliable and free of assumptions. © 2015 The Authors. Published by Elsevier B.V.
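The abstract names a logistic regression ensemble but does not describe its mechanics. The following is a minimal sketch of the general idea only, not the authors' implementation: it assumes the features are randomly split into mutually exclusive subsets, one logistic regression is fitted per subset, and the predicted probabilities are averaged over several random splits. The function names, the synthetic data, the gradient-descent settings and the 0.5 decision threshold are all illustrative assumptions.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, cols, epochs=200, lr=0.5):
    """Fit logistic regression on the feature subset `cols` by batch gradient descent."""
    w, b, n = [0.0] * len(cols), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(cols), 0.0
        for row, target in zip(X, y):
            err = sigmoid(b + sum(wj * row[c] for wj, c in zip(w, cols))) - target
            gb += err
            for j, c in enumerate(cols):
                gw[j] += err * row[c]
        b -= lr * gb / n
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
    return cols, w, b

def fit_ensemble(X, y, n_parts=2, n_repeats=5, seed=0):
    """Fit one model per feature subset, over several random partitions (assumed scheme)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_repeats):
        idx = list(range(len(X[0])))
        rng.shuffle(idx)
        # Mutually exclusive feature subsets of roughly equal size.
        parts = [idx[k::n_parts] for k in range(n_parts)]
        models.extend(fit_logistic(X, y, part) for part in parts if part)
    return models

def predict_proba(models, row):
    """Average the predicted probabilities of all base models."""
    total = 0.0
    for cols, w, b in models:
        total += sigmoid(b + sum(wj * row[c] for wj, c in zip(w, cols)))
    return total / len(models)

# Synthetic stand-in data (hypothetical): 4 features, label 1 when they sum above zero.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
y = [1 if sum(row) > 0 else 0 for row in X]

models = fit_ensemble(X, y)
hits = sum((predict_proba(models, row) > 0.5) == bool(t) for row, t in zip(X, y))
acc = hits / len(X)
print(f"base models: {len(models)}, training accuracy: {acc:.2f}")
```

Because each base model sees only a few features, no single fit has to estimate a coefficient for every variable at once, which is the property that makes this kind of ensemble attractive for high-dimensional data.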
Pages: 86-93
Number of pages: 8
Related papers
27 in total
  • [1] Sample size determination in logistic regression
    Alam, M. Khorshed
    Rao, M. Bhaskara
    Cheng, Fu-Chih
    SANKHYA-SERIES B-APPLIED AND INTERDISCIPLINARY STATISTICS, 2010, 72 (01): : 58 - 75
  • [2] Sample size determination in logistic regression
    Khorshed Alam M.
    Bhaskara Rao M.
    Cheng F.-C.
    Sankhya B, 2010, 72 (1) : 58 - 75
  • [3] Sample size determination for logistic regression
    Motrenko, Anastasiya
    Strijov, Vadim
    Weber, Gerhard-Wilhelm
    JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2014, 255 : 743 - 752
  • [4] Calculating sample size bounds for logistic regression
    Broll, S
    Glaser, S
    Kreienbrock, L
    PREVENTIVE VETERINARY MEDICINE, 2002, 54 (02) : 105 - 111
  • [5] Predicting Customer's Satisfaction (Dissatisfaction) Using Logistic Regression
    Anand, Adarsh
    Bansal, Gunjan
    INTERNATIONAL JOURNAL OF MATHEMATICAL ENGINEERING AND MANAGEMENT SCIENCES, 2016, 1 (02) : 77 - 88
  • [6] Optimal Subsampling for Large Sample Logistic Regression
    Wang, HaiYing
    Zhu, Rong
    Ma, Ping
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2018, 113 (522) : 829 - 844
  • [7] Sample size calculations for logistic and Poisson regression models
    Shieh, G
    BIOMETRIKA, 2001, 88 (04) : 1193 - 1199
  • [8] A simple approach to power and sample size calculations in logistic regression and Cox regression models
    Væth, M
    Skovlund, E
    STATISTICS IN MEDICINE, 2004, 23 (11) : 1781 - 1792
  • [9] Sample size determination for logistic regression on a logit-normal distribution
    Kim, Seongho
    Heath, Elisabeth
    Heilbrun, Lance
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2017, 26 (03) : 1237 - 1247
  • [10] Large sample convergence diagnostics for likelihood based inference: Logistic regression
    Brimacombe, Michael
    STATISTICAL METHODOLOGY, 2016, 33 : 114 - 130