Incorporating Multiple Textual Factors into Unbalanced Financial Distress Prediction: A Feature Selection Methods and Ensemble Classifiers Combined Approach

Cited by: 0
Authors
Shixuan Li
Wenxuan Shi
Affiliations
[1] Wuhan University of Technology,School of Safety Science and Emergency Management
[2] Wuhan University,School of Information Management
Source
International Journal of Computational Intelligence Systems, Vol. 16
Keywords
Textual factors; Feature selection; Ensemble classifiers; Financial distress prediction; Word embedding
DOI
Not available
Abstract
Text-based factors are widely regarded as promising features for financial applications. This study extracts both basic and semantic textual features to supplement traditionally used financial indicators. The main aim is to improve financial distress prediction (FDP) for Chinese listed companies. The study proposes a paradigm that combines financial and multi-type textual predictive factors, feature selection methods, classifiers, and time spans to find the best-performing FDP configuration. Frequency counts, TF-IDF, TextRank, and word embedding approaches are used to extract frequency count-based, keyword-based, sentiment, and readability indicators. The experimental results show that financial-domain sentiment lexicons, word embedding-based readability analysis, and the basic textual features of the Management Discussion and Analysis (MD&A) section can be important elements of FDP. The findings also indicate that combining financial and textual features achieves the best performance 4 or 5 years before the expected baseline year, and that the combined RF-GBDT model outperforms the other classifiers. The study extends the use of multiple text analysis methods in financial text mining and provides new findings on how to produce early warning signs of financial risk. The approaches developed here can serve as a template for addressing other financial issues.
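The abstract names TF-IDF as one of the techniques used to derive keyword-based indicators. As a rough illustration only, the following Python sketch computes plain TF-IDF weights over tokenized documents and picks the top-weighted terms of one document. It is not the authors' implementation: the function names `tfidf` and `top_keywords`, the unsmoothed `log(N / df)` weighting, and the toy MD&A-style tokens are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(doc_weights, k=3):
    """Return the k highest-weighted terms; ties are broken alphabetically."""
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical, already tokenized snippets (not data from the paper).
docs = [
    ["revenue", "growth", "strong", "revenue"],
    ["debt", "default", "risk", "revenue"],
    ["debt", "restructuring", "loss", "risk"],
]
weights = tfidf(docs)
print(top_keywords(weights[1], k=2))
```

A term that appears in every document gets weight zero under this weighting, which is why rarer terms rise to the top as candidate keywords. In a prediction setting, such weights would be one group of textual features placed alongside the financial indicators before feature selection and classification.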