Credit risk assessment using statistical and machine learning: basic methodology and risk modeling applications

Cited by: 125
Authors
Galindo, J.
Tamayo, P.
Affiliations
[1] Harvard University, Department of Economics
[2] Thinking Machines Corp.
Keywords
Risk Assessment; Training Sample; Average Error; Modeling Technique; Risk Model
DOI
10.1023/A:1008699112516
Abstract
Risk assessment of financial intermediaries is an area of renewed interest due to the financial crises of the 1980s and 1990s. An accurate estimation of risk, and its use in corporate or global financial risk models, could translate into a more efficient use of resources. One important ingredient in accomplishing this goal is finding accurate predictors of individual risk in the credit portfolios of institutions. In this context we make a comparative analysis of different statistical and machine learning classification methods on a mortgage loan data set, with the aim of understanding their limitations and potential. We introduce a specific modeling methodology based on the study of error curves. Using state-of-the-art modeling techniques we built more than 9,000 models as part of the study. The results show that CART decision-tree models provide the best estimation for default, with an average 8.31% error rate for a training sample of 2,000 records. From the error curve analysis for this model we conclude that if more data were available, approximately 22,000 records, a potential 7.32% error rate could be achieved. Neural networks provided the second-best results with an average error of 11.00%. The K-Nearest Neighbor algorithm had an average error rate of 14.95%. These results outperformed the standard Probit algorithm, which attained an average error rate of 15.13%. Finally, we discuss the possibility of using this type of accurate predictive model as an ingredient of institutional and global risk models.
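The error-curve idea in the abstract (measure average error at several training-sample sizes, then extrapolate to a larger sample) can be illustrated with a minimal sketch. The abstract does not state the functional form the authors fit, so the inverse power law e(m) = a + b * m^(-c) used here is an assumption, chosen because it is a common model for learning curves. The function names and the synthetic data points are hypothetical and are not taken from the paper.

```python
def fit_error_curve(sizes, errors, exponents=None):
    """Fit e(m) = a + b * m**(-c).

    Assumed form (not confirmed by the source). For each candidate
    exponent c, a and b come from ordinary least squares on x = m**(-c);
    the c with the smallest squared error wins.
    """
    if exponents is None:
        exponents = [k / 100 for k in range(5, 201)]  # c in 0.05 .. 2.00
    n = len(sizes)
    mean_y = sum(errors) / n
    best = None
    for c in exponents:
        x = [m ** (-c) for m in sizes]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, errors))
        b = sxy / sxx
        a = mean_y - b * mean_x
        sse = sum((a + b * xi - yi) ** 2 for xi, yi in zip(x, errors))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c


def predict_error(a, b, c, m):
    """Extrapolated error rate for a training sample of m records."""
    return a + b * m ** (-c)


# Synthetic, noise-free illustration only: these are NOT the paper's data.
sizes = [125, 250, 500, 1000, 2000]
errors = [0.07 + 0.5 * m ** (-0.5) for m in sizes]

a, b, c = fit_error_curve(sizes, errors)
print("asymptotic error a:", a)
print("extrapolated error at 22000 records:", predict_error(a, b, c, 22000))
```

In this form the fitted intercept a plays the role of the error floor that more data cannot remove, which is why extrapolating to a larger sample yields a lower but bounded error estimate.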
Pages: 107-143
Page count: 36