Bankruptcy prediction is of paramount interest to both academics and practitioners. This paper focuses on an important aspect of bankruptcy prediction modeling: the data sample selection issue. To investigate the effect of different data selection methods, three models are adopted: logistic regression, Neural Networks (NNET), and Support Vector Machines (SVM), which have recently gained some popularity in applications. A Monte Carlo simulation study and an empirical analysis of an updated bankruptcy database are conducted to explore the effect of the different sample selection methods. Comparing out-of-sample predictive performance, we conclude that if forecasting the probability of bankruptcy is of interest, the complete data sampling technique provides more accurate results. However, if a binary bankruptcy decision or classification is desired, the choice-based sampling technique may still be suitable. In particular, choice-based data samples validated by NNET and SVM capture more correct predictions of bankruptcy observations and yield a lower asymmetric misclassification rate. In addition, for different choice-based data samples, it is essential to adjust the cut-off probability. An appropriate choice of cut-off probability depends on the specified cost ratio between the Type I error and the Type II error. The optimal cut-off probability proposed in this work is a function of the data sample selection method and the cost ratio.
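To illustrate why the cut-off probability has to move with both the sampling scheme and the cost ratio, the following is a minimal sketch based on the textbook expected-cost decision rule combined with the standard prior (odds) correction for choice-based samples. It is not the formula proposed in the paper, which the abstract does not state. The function names, the convention that a Type I error means a bankrupt firm classified as healthy, and the definition of `cost_ratio` as the Type I cost divided by the Type II cost are all assumptions made for this example.

```python
def population_probability(p_sample, sample_rate, population_rate):
    """Map a probability fitted on a choice-based sample back to the
    population scale using the standard odds (prior) correction.

    sample_rate     -- share of bankrupt firms in the estimation sample
    population_rate -- share of bankrupt firms in the population
    """
    odds_sample = p_sample / (1.0 - p_sample)
    correction = (population_rate / (1.0 - population_rate)) / (
        sample_rate / (1.0 - sample_rate)
    )
    odds_population = odds_sample * correction
    return odds_population / (1.0 + odds_population)


def cutoff_on_sample_scale(cost_ratio, sample_rate, population_rate):
    """Cut-off to apply to probabilities from the choice-based sample.

    Assumed rule: flag a firm as bankrupt when the expected cost of
    missing a bankruptcy exceeds the expected cost of a false alarm,
    i.e. when the population-scale probability exceeds
    1 / (1 + cost_ratio), with cost_ratio = cost(Type I) / cost(Type II).
    """
    odds_population_cutoff = 1.0 / cost_ratio
    odds_sample_cutoff = odds_population_cutoff * (
        sample_rate / (1.0 - sample_rate)
    ) / (population_rate / (1.0 - population_rate))
    return odds_sample_cutoff / (1.0 + odds_sample_cutoff)


if __name__ == "__main__":
    # Hypothetical numbers: a 50/50 matched sample, 1% population rate.
    for ratio in (1.0, 10.0, 50.0):
        print(ratio, cutoff_on_sample_scale(ratio, 0.5, 0.01))
```

Under these assumptions, a model fitted on a balanced matched sample needs a cut-off far above 0.5 when the two error costs are equal, and the cut-off falls as the relative cost of missing a bankruptcy rises. When the sample is drawn at the population rate, the rule reduces to the familiar threshold 1 / (1 + cost_ratio).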
Pages: 91-116
Page count: 26