Churn prediction via support vector classification: An empirical comparison

Cited by: 3
Authors
Maldonado, Sebastian [1]
Affiliations
[1] Univ Los Andes, Santiago, Chile
Keywords
Support vector machines; support vector data description; feature selection; class imbalance problem; data mining; FEATURE-SELECTION; ACCURACY
DOI
10.3233/IDA-150774
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An empirical framework for customer churn prediction modeling is presented in this work. This task represents a very interesting business analytics challenge, given its highly class imbalanced nature, and the presence of noisy variables that adversely affect the prediction capabilities of classification models. In this work, two SVM-based techniques are compared: Support Vector Data Description (SVDD), and standard two-class SVMs. The proposed methodology involves the comparison of these two methods under different conditions of class imbalance and using different subsets of variables. Feature ranking is performed via the Fisher Score Criterion, while the class imbalance problem is dealt with through resampling techniques, namely random undersampling and SMOTE oversampling. Experiments on four customer churn prediction datasets show the advantages of SVDD: it outperforms standard SVM in terms of predictive performance, demonstrating the importance of techniques that take the class imbalance problem into account.
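The preprocessing steps named in the abstract (Fisher Score feature ranking, random undersampling, and SMOTE oversampling) can be illustrated with a minimal NumPy sketch. This is not the author's implementation: the Fisher Score form used here (absolute difference of class means divided by the sum of class variances) is one common variant, the SMOTE routine is a simplified version of the technique, and the function names, the synthetic data, and the parameter values (`k=5`, a 5% churn rate) are all assumptions chosen for illustration. The SVDD and SVM classifiers themselves are omitted.

```python
import numpy as np

def fisher_scores(X, y):
    """Fisher score per feature for binary labels (1 = churner, 0 = non-churner).

    Assumed form: |mean_pos - mean_neg| / (var_pos + var_neg).
    """
    pos, neg = X[y == 1], X[y == 0]
    num = np.abs(pos.mean(axis=0) - neg.mean(axis=0))
    den = pos.var(axis=0) + neg.var(axis=0)
    return num / np.maximum(den, 1e-12)  # guard against zero variance

def random_undersample(X, y, rng):
    """Drop majority-class rows at random until both classes have equal size."""
    idx_pos = np.flatnonzero(y == 1)
    idx_neg = np.flatnonzero(y == 0)
    minority, majority = sorted((idx_pos, idx_neg), key=len)
    kept = rng.choice(majority, size=len(minority), replace=False)
    idx = np.concatenate([minority, kept])
    return X[idx], y[idx]

def smote(X_min, n_new, k, rng):
    """Simplified SMOTE: interpolate between a minority sample and one of
    its k nearest minority-class neighbours (requires k < len(X_min))."""
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation weight in [0, 1)
    return X_min[base] + gap * (X_min[chosen] - X_min[base])

# Synthetic, illustrative data: 5% churners, only feature 0 is informative.
rng = np.random.default_rng(0)
n_neg, n_pos = 950, 50
X = rng.normal(size=(n_neg + n_pos, 5))
y = np.r_[np.zeros(n_neg, dtype=int), np.ones(n_pos, dtype=int)]
X[y == 1, 0] += 2.0

scores = fisher_scores(X, y)
ranking = np.argsort(scores)[::-1]          # best feature first
X_us, y_us = random_undersample(X, y, rng)  # balanced by removing rows
X_syn = smote(X[y == 1], n_new=900, k=5, rng=rng)  # synthetic churners
```

In a comparison like the one the abstract describes, the top-ranked features from `ranking` would be used to subset the columns, and either `X_us, y_us` or the original data augmented with `X_syn` would be passed to the classifier under evaluation.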
Pages: S135-S147
Number of pages: 13
Related papers
40 in total
[1]   Asuncion Arthur, 2007, UCI machine learning repository
[2]   Bach FR, 2006, J MACH LEARN RES, V7, P1713
[3]   Blattberg R., 2008, DATABASE MARKETING A
[4]   Granting and managing loans for micro-entrepreneurs: New developments and practical experiences [J].
Bravo, Cristian ;
Maldonado, Sebastian ;
Weber, Richard .
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2013, 227 (02) :358-366
[5]   Future trends in business analytics and optimization [J].
Brown, Donald E. ;
Famili, Fazel ;
Paass, Gerhard ;
Smith-Miles, Kate ;
Thomas, Lyn C. ;
Weber, Richard ;
Baeza-Yates, Ricardo ;
Bravo, Cristian ;
L'Huillier, Gaston ;
Maldonado, Sebastian .
INTELLIGENT DATA ANALYSIS, 2011, 15 (06) :1001-1017
[6]   A nested heuristic for parameter tuning in Support Vector Machines [J].
Carrizosa, Emilio ;
Martin-Barragan, Belen ;
Romero Morales, Dolores .
COMPUTERS & OPERATIONS RESEARCH, 2014, 43 :328-334
[7]   Supervised classification and mathematical optimization [J].
Carrizosa, Emilio ;
Romero Morales, Dolores .
COMPUTERS & OPERATIONS RESEARCH, 2013, 40 (01) :150-165
[8]   Detecting relevant variables and interactions in supervised classification [J].
Carrizosa, Emilio ;
Martin-Barragan, Belen ;
Morales, Dolores Romero .
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2011, 213 (01) :260-269
[9]   LIBSVM: A Library for Support Vector Machines [J].
Chang, Chih-Chung ;
Lin, Chih-Jen .
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[10]   SMOTE: Synthetic minority over-sampling technique [J].
Chawla, Nitesh V. ;
Bowyer, Kevin W. ;
Hall, Lawrence O. ;
Kegelmeyer, W. Philip .
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2002, 16 :321-357