Logistic Regression Ensemble for Predicting Customer Defection with Very Large Sample Size

Cited by: 8
Authors
Kuswanto, Heri [1]
Asfihani, Ayu [1]
Sarumaha, Yogi [1]
Ohwada, Hayato [2]
Affiliations
[1] Inst Teknol Sepuluh Nopember, Dept Stat, Kampus ITS Sukolilo, Surabaya 60111, Indonesia
[2] Tokyo Univ Sci, Grad Sch Sci & Technol, Dept Ind Adm, Noda, Chiba 278, Japan
Source
THIRD INFORMATION SYSTEMS INTERNATIONAL CONFERENCE 2015 | 2015 / Vol. 72
Keywords
ensemble; logistic regression; classification; high dimensional data; machine learning
DOI
10.1016/j.procs.2015.12.108
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Predicting customer defection is an important problem for companies that produce cloud-based software. The company studied sells three products (High, Medium and Low Price), and the customer can choose to defect or to retain the product after a certain period of time. Because the company has collected a very large dataset, standard statistical models become inapplicable owing to the curse of dimensionality. Parametric statistical models tend to produce very large standard errors, which may lead to inaccurate predictions. This research examines a machine learning approach developed for high-dimensional data, namely the logistic regression ensemble (LORENS). Using computational approaches, LORENS predicts as well as the standard logistic regression model, i.e. with prediction accuracy between 66% and 77%. In this case, LORENS is preferable because it is more reliable and free of assumptions. (C) 2015 The Authors. Published by Elsevier B.V.
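The abstract does not describe how the ensemble is built. As a rough illustration only, the sketch below assumes the general idea of a partition-based logistic regression ensemble: the predictors are randomly split into disjoint subsets, a logistic regression is fitted on each subset, the split is repeated several times, and the predicted probabilities are averaged. The function names, the plain gradient-descent fitter, the simple averaging rule and all parameter values are hypothetical choices for this sketch, not the authors' implementation.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=500):
    """Fit logistic regression by batch gradient descent (illustrative fitter)."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))          # current predicted probabilities
        w -= lr * Xb.T @ (p - y) / len(y)          # gradient of the mean log-loss
    return w

def predict_proba(w, X):
    """Predicted probability of class 1 for weights w (intercept first)."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def fit_partition_ensemble(X, y, n_subsets=3, n_partitions=5, seed=0):
    """Fit one logistic model per feature subset, over several random partitions.

    Assumption: each partition splits ALL columns into disjoint subsets.
    Returns a list of (column_indices, weights) pairs.
    """
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_partitions):
        perm = rng.permutation(X.shape[1])          # shuffle the feature indices
        for cols in np.array_split(perm, n_subsets):
            models.append((cols, fit_logistic(X[:, cols], y)))
    return models

def ensemble_predict_proba(models, X):
    """Average the predicted probabilities of all subset models (simplified rule)."""
    probs = [predict_proba(w, X[:, cols]) for cols, w in models]
    return np.mean(probs, axis=0)

# Usage on synthetic data (made up for illustration, not the paper's dataset).
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.normal(size=(200, 6))
y_demo = (X_demo[:, 0] - X_demo[:, 1] > 0).astype(float)
demo_models = fit_partition_ensemble(X_demo, y_demo, n_subsets=2, n_partitions=3)
demo_labels = ensemble_predict_proba(demo_models, X_demo) > 0.5  # 0.5 cut-off is an assumption
```

Each subset model sees only a few predictors, which keeps every individual fit small even when the full predictor set is large. The number of subsets, the number of partitions and the classification threshold would all need tuning in practice.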
Pages: 86-93
Number of pages: 8