The effect of feature selection on financial distress prediction

Cited by: 150
Authors
Liang, Deron [1]
Tsai, Chih-Fong [2]
Wu, Hsin-Ting [1]
Affiliations
[1] Natl Cent Univ, Dept Comp Sci & Informat Engn, Tainan, Taiwan
[2] Natl Cent Univ, Dept Informat Management, Tainan, Taiwan
Keywords
Financial distress prediction; Bankruptcy prediction; Credit scoring; Feature selection; Data mining; INTEGRATING FEATURE-SELECTION; PARTICLE SWARM OPTIMIZATION; SUPPORT VECTOR MACHINES; DISCRIMINANT-ANALYSIS; FAILURE PREDICTION; NEURAL-NETWORKS; CLASSIFICATION; ALGORITHMS; ENSEMBLE; BANKS
DOI
10.1016/j.knosys.2014.10.010
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Financial distress prediction is important for financial institutions because it lets them assess the financial health of enterprises and individuals. Bankruptcy prediction and credit scoring are two important problems in financial distress prediction, and various statistical and machine learning techniques have been employed to develop prediction models for them. Since there are no generally agreed-upon financial ratios to use as input features for model development, many studies treat feature selection as a pre-processing step in data mining before constructing the models. However, most works have applied only specific feature selection methods to either the bankruptcy prediction or the credit scoring problem domain. In this work, a comprehensive study is conducted to examine the effect of filter-based and wrapper-based feature selection methods on financial distress prediction. In addition, the effect of feature selection on prediction models obtained using various classification techniques is investigated. The experiments use two bankruptcy datasets and two credit datasets, and study three filter-based and two wrapper-based feature selection methods combined with six different prediction models. Our experimental results show that there is no single best combination of feature selection method and classification technique across the four datasets. Moreover, depending on the chosen techniques, performing feature selection does not always improve prediction performance. However, on average, using the genetic algorithm and logistic regression for feature selection can provide prediction improvements over the credit and bankruptcy datasets, respectively. (C) 2014 Elsevier B.V. All rights reserved.
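The distinction the abstract draws between filter-based and wrapper-based feature selection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic, the filter is a simple standardized mean-difference score, the wrapper is greedy forward selection around a nearest-centroid classifier, and all function names are hypothetical. The paper's own methods (for example the genetic algorithm) and its six classifiers are not reproduced here.

```python
import random
import statistics

def make_data(n=200, seed=0):
    """Synthetic two-class data: features 0 and 1 are informative, 2-4 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(3.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)]
        row += [rng.gauss(0.0, 1.0) for _ in range(3)]
        X.append(row)
        y.append(label)
    return X, y

def filter_scores(X, y):
    """Filter: score each feature on its own, without consulting any classifier."""
    scores = []
    for j in range(len(X[0])):
        a = [r[j] for r, t in zip(X, y) if t == 0]
        b = [r[j] for r, t in zip(X, y) if t == 1]
        pooled = ((statistics.pvariance(a) + statistics.pvariance(b)) / 2) ** 0.5
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / pooled)
    return scores

def centroid_accuracy(X_tr, y_tr, X_va, y_va, feats):
    """Nearest-centroid classifier restricted to `feats`; returns validation accuracy."""
    cents = {}
    for c in (0, 1):
        rows = [r for r, t in zip(X_tr, y_tr) if t == c]
        cents[c] = [statistics.mean([r[j] for r in rows]) for j in feats]
    hits = 0
    for r, t in zip(X_va, y_va):
        dist = {c: sum((r[j] - m) ** 2 for j, m in zip(feats, cents[c])) for c in cents}
        hits += min(dist, key=dist.get) == t
    return hits / len(y_va)

def wrapper_forward(X_tr, y_tr, X_va, y_va):
    """Wrapper: greedily add the feature that most improves the classifier's accuracy."""
    selected, best = [], 0.0
    remaining = list(range(len(X_tr[0])))
    while remaining:
        acc, j = max(
            (centroid_accuracy(X_tr, y_tr, X_va, y_va, selected + [k]), k)
            for k in remaining
        )
        if acc <= best:  # stop as soon as no candidate feature helps
            break
        selected.append(j)
        remaining.remove(j)
        best = acc
    return selected, best

X, y = make_data()
X_tr, y_tr, X_va, y_va = X[:100], y[:100], X[100:], y[100:]
scores = filter_scores(X_tr, y_tr)
filter_top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:2]
wrapper_sel, wrapper_acc = wrapper_forward(X_tr, y_tr, X_va, y_va)
print("filter top-2 features:", filter_top)
print("wrapper selection:", wrapper_sel, "validation accuracy:", wrapper_acc)
```

The filter ranks features once, independently of any model, while the wrapper retrains and re-scores the classifier for every candidate subset. That makes the wrapper more expensive and ties its result to the chosen classifier. Note also that this sketch reports accuracy on the same split that guided the wrapper's search, so the figure is optimistic; a separate test split would be needed for a fair estimate.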
Pages: 289-297
Number of pages: 9