A Meta-Learning Approach to Predicting Financial Statement Fraud

Cited by: 7
Authors
McKee, Thomas E. [1, 2]
Affiliations
[1] Med Univ South Carolina, Charleston, SC 29425 USA
[2] Norwegian Sch Econ & Business Adm, Bergen, Norway
Keywords
meta-learning; fraud prediction; model stacking; classification tree algorithm; neural network; logistic regression
DOI
10.2308/jeta.2009.6.1.5
Chinese Library Classification number
F8 [Public Finance, Finance];
Discipline classification code
0202;
Abstract
An "ultimate learning algorithm" is one that produces models that closely match the real world's underlying distribution of functions. To try to create such an algorithm, researchers typically employ manual algorithm design with cross-validation. It has been shown that cross-validation is not a viable way to construct an ultimate learning algorithm. For machine learning researchers, "meta-learning" should be more desirable than manual algorithm design with cross-validation. Meta-learning is concerned with gaining knowledge about learning methodologies. One meta-learning approach involves evaluating the suitability of various algorithms for a learning task in order to select an appropriate algorithm. An alternative approach is to incorporate predictions from base algorithms as features to be evaluated by subsequent algorithms. This paper reports on exploratory research that implemented the latter approach as a three-layer stacked generalization model using neural networks, logistic regression, and classification tree algorithms to predict all categories of financial fraud. The purpose was to see if this form of meta-learning offered significant benefits for financial fraud prediction. Fifteen possible financial fraud predictors were identified based on a theoretical fraud model from prior research. Only public data for these possible predictors were obtained from U.S. Securities and Exchange Commission filings from the period 1995-2002 for a sample of 50 fraud and 50 non-fraud companies. These data were selected for the year prior to when the fraud was initiated. These variables were used to create a variety of neural network, logistic regression, and classification tree models while using holdout sample and cross-validation techniques. A 71.4 percent accurate neural network model was then stacked into a logistic regression model, increasing the prediction accuracy to 76.5 percent. The logistic regression model was subsequently stacked into a classification tree model to achieve an 83 percent accuracy rate. These results compared favorably to two prior neural network studies, also employing only public data, which achieved 63 percent accuracy rates. Model results were also analyzed via probability-adjusted overall error rates, relative misclassification costs, and receiver operating characteristics. The increase in classification accuracy from 71 percent to 83 percent, the decline in estimated overall error rate from 0.0057 to 0.0035, and the decline in relative misclassification costs from 2.79 to 0.58 suggest that benefits were achieved by the meta-learning stacking approach. Further research into the meta-learning stacking approach appears warranted.
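The stacking idea described in the abstract, where each layer's prediction is passed on as an extra feature to the next layer, can be illustrated with a minimal sketch. This is not the paper's implementation. The data are synthetic, and every name here (`fit_logistic`, `fit_stump`, `add_layer`, `make_data`) is hypothetical. A single logistic unit stands in for the neural network and a one-split stump stands in for the classification tree. The use of out-of-fold predictions for the training rows is an assumption made to limit leakage between layers.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, epochs=100, lr=0.1):
    """Logistic regression fit by stochastic gradient descent; returns a probability function."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return lambda x: sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_stump(X, y):
    """One-split classification tree: the (feature, threshold, direction) with fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, -1):
                errors = sum((1 if sign * (x[j] - t) > 0 else 0) != yi for x, yi in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda x: 1 if sign * (x[j] - t) > 0 else 0

def add_layer(fit, X_train, y_train, X_test, k=5):
    """Append one layer's prediction as a new feature column.

    Training rows receive out-of-fold predictions; test rows receive
    predictions from a model fit on all training rows.
    """
    oof = [0.0] * len(X_train)
    for fold in range(k):
        idx = [i for i in range(len(X_train)) if i % k != fold]
        model = fit([X_train[i] for i in idx], [y_train[i] for i in idx])
        for i in range(fold, len(X_train), k):
            oof[i] = model(X_train[i])
    full = fit(X_train, y_train)
    return ([x + [p] for x, p in zip(X_train, oof)],
            [x + [full(x)] for x in X_test])

def make_data(n, seed):
    """Synthetic two-class data (an assumption; not the paper's SEC-filing variables)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(1.0 * label, 1.0), rng.gauss(-0.8 * label, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y

X_train, y_train = make_data(100, seed=0)
X_test, y_test = make_data(40, seed=1)

# Layer 1: logistic unit standing in for the neural network.
train_l1, test_l1 = add_layer(fit_logistic, X_train, y_train, X_test)
# Layer 2: logistic regression on the original features plus the layer-1 prediction.
train_l2, test_l2 = add_layer(fit_logistic, train_l1, y_train, test_l1)
# Layer 3: stump standing in for the classification tree, using both earlier predictions.
tree = fit_stump(train_l2, y_train)
predictions = [tree(x) for x in test_l2]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"holdout accuracy of the stacked model: {accuracy:.2f}")
```

The sketch only shows how predictions flow between layers. It makes no claim about reproducing the accuracy figures reported in the abstract.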
Pages: 5-26
Page count: 22