Multimodal Machine Learning for Credit Modeling

Cited by: 6
Authors
Nguyen, Cuong V. [1]
Das, Sanjiv R. [2, 3]
He, John [4 ]
Yue, Shenghua [4 ]
Hanumaiah, Vinay [4 ]
Ragot, Xavier [5 ]
Zhang, Li [6 ]
Affiliations
[1] Amazon Web Serv, Pasadena, CA 91125 USA
[2] AWS, Palo Alto, CA 94303 USA
[3] Santa Clara Univ, Palo Alto, CA 94303 USA
[4] Amazon Web Serv, Palo Alto, CA 94303 USA
[5] Amazon Web Serv, San Francisco, CA 94111 USA
[6] Amazon Web Serv, New York, NY 10001 USA
Source
2021 IEEE 45TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2021) | 2021
Keywords
credit ratings; multimodal; machine learning; long-form text; financial ratios; prediction; text
DOI
10.1109/COMPSAC51774.2021.00262
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Credit ratings are traditionally generated by models that use financial statement data and market data, both of which are tabular (numeric and categorical). Practitioner and academic models do not include text data. Using an automated approach to combine long-form text from SEC filings with the tabular data, we show how multimodal machine learning using stack ensembling and bagging can generate more accurate rating predictions. This paper demonstrates a methodology that uses big data to extend tabular data models, which the ratings industry has used for decades, to the class of multimodal machine learning models.
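The abstract names stack ensembling and bagging over tabular and text inputs but gives no implementation detail. The following is only a minimal sketch of that general idea, not the authors' pipeline. Everything in it is an assumption made for illustration: the toy records, the binary label (1 = investment grade) standing in for a full rating scale, the choice of a nearest-centroid learner for the tabular features and a naive Bayes learner for the text, and all function names.

```python
import math
import random
from collections import Counter

# Hypothetical toy records: (tabular features, filing text, label).
DATA = [
    ([0.90, 0.20], "strong liquidity stable growth", 1),
    ([0.80, 0.30], "stable cash flow growth", 1),
    ([0.70, 0.10], "strong earnings stable outlook", 1),
    ([0.85, 0.25], "growth in cash and strong margins", 1),
    ([0.20, 0.90], "default risk covenant breach", 0),
    ([0.30, 0.80], "going concern doubt liquidity risk", 0),
    ([0.10, 0.70], "covenant breach and impairment", 0),
    ([0.25, 0.85], "impairment losses default risk", 0),
]

def fit_centroid(rows):
    """Tabular base learner: nearest-centroid distance mapped to P(label=1)."""
    cents = {}
    for label in (0, 1):
        pts = [x for x, y in rows if y == label]
        if pts:
            cents[label] = [sum(col) / len(pts) for col in zip(*pts)]
    if len(cents) < 2:  # bootstrap sample happened to contain one class only
        only = float(next(iter(cents)))
        return lambda x: only
    # Closer to the class-1 centroid than to the class-0 centroid gives p > 0.5.
    return lambda x: 1 / (1 + math.exp(math.dist(x, cents[1]) - math.dist(x, cents[0])))

def fit_naive_bayes(rows):
    """Text base learner: multinomial naive Bayes with add-one smoothing."""
    counts, n_docs = {0: Counter(), 1: Counter()}, {0: 0, 1: 0}
    for text, y in rows:
        counts[y].update(text.split())
        n_docs[y] += 1
    vocab = len(set(counts[0]) | set(counts[1])) + 1  # +1 slot for unseen words
    totals = {c: sum(counts[c].values()) + vocab for c in (0, 1)}
    def predict(text):
        log_odds = math.log((n_docs[1] + 1) / (n_docs[0] + 1))
        for w in text.split():
            log_odds += math.log((counts[1][w] + 1) / totals[1])
            log_odds -= math.log((counts[0][w] + 1) / totals[0])
        return 1 / (1 + math.exp(-log_odds))
    return predict

def bagged(fit, rows, n_bags, rng):
    """Bagging: fit on bootstrap resamples and average the predictions."""
    models = [fit([rng.choice(rows) for _ in rows]) for _ in range(n_bags)]
    return lambda x: sum(m(x) for m in models) / len(models)

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Meta-learner: logistic regression trained by plain SGD."""
    w = [0.0] * (len(X[0]) + 1)  # last entry is the bias
    score = lambda x: w[-1] + sum(wj * xj for wj, xj in zip(w, x))
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = 1 / (1 + math.exp(-score(xi))) - yi
            for j, xj in enumerate(xi):
                w[j] -= lr * err * xj
            w[-1] -= lr * err
    return lambda x: 1 / (1 + math.exp(-score(x)))

def fit_stack(data, n_folds=4, n_bags=15, seed=0):
    """Stacking: out-of-fold base predictions become the meta-learner's inputs."""
    rng = random.Random(seed)
    def base_models(train):
        tab = bagged(fit_centroid, [(t, y) for t, _, y in train], n_bags, rng)
        txt = bagged(fit_naive_bayes, [(s, y) for _, s, y in train], n_bags, rng)
        return tab, txt
    meta_X, meta_y = [], []
    for k in range(n_folds):
        train = [r for i, r in enumerate(data) if i % n_folds != k]
        tab, txt = base_models(train)
        for t, s, y in (r for i, r in enumerate(data) if i % n_folds == k):
            meta_X.append([tab(t), txt(s)])
            meta_y.append(y)
    meta = fit_logistic(meta_X, meta_y)
    tab, txt = base_models(data)  # refit the base learners on all records
    return lambda t, s: meta([tab(t), txt(s)])

model = fit_stack(DATA)
print(model([0.88, 0.20], "strong stable growth"))
print(model([0.15, 0.80], "default risk and covenant breach"))
```

Each modality gets its own bagged base learner, and the meta-learner only ever sees predictions made on records the base learners were not trained on, which is what distinguishes stacking from simply averaging the two models.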
Pages: 1754-1759
Page count: 6