A Comparison of Pre-Trained Language Models for Multi-Class Text Classification in the Financial Domain

Cited by: 38
Authors
Arslan, Yusuf [1 ]
Allix, Kevin [1 ]
Veiber, Lisa [1 ]
Lothritz, Cedric [1 ]
Bissyande, Tegawende F. [1 ]
Klein, Jacques [1 ]
Goujon, Anne [2 ]
Affiliations
[1] Univ Luxembourg, Luxembourg, Luxembourg
[2] BGL BNP Paribas, Luxembourg, Luxembourg
Source
WEB CONFERENCE 2021: COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2021) | 2021
Keywords
BERT; FinBERT; financial text classification; Naive Bayes
DOI
10.1145/3442442.3451375
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Neural networks for language modeling have proven effective on several sub-tasks of natural language processing. Training deep language models, however, is time-consuming and computationally intensive. Pre-trained language models such as BERT are thus appealing since (1) they yield state-of-the-art performance, and (2) they relieve practitioners of the burden of preparing the resources (time, hardware, and data) needed to train models. Nevertheless, because pre-trained models are generic, they may underperform on specific domains. In this study, we investigate the case of multi-class text classification, a task that is relatively less studied in the literature evaluating pre-trained language models. Our work is further placed in the industrial setting of the financial domain. We thus leverage generic benchmark datasets from the literature and two proprietary datasets from our partners in the financial technology industry. After highlighting a challenge for generic pre-trained models (BERT, DistilBERT, RoBERTa, XLNet, XLM) in classifying a portion of the financial document dataset, we investigate the intuition that a pre-trained model specialized for financial documents, such as FinBERT, should be leveraged. Nevertheless, our experiments show that the FinBERT model, even with an adapted vocabulary, does not lead to improvements over the generic BERT models.
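The abstract describes fine-tuning generic pre-trained transformers for multi-class classification of financial documents. The sketch below is not the authors' code; it is a minimal illustration of what such a setup typically looks like with the HuggingFace Transformers library, assuming the bert-base-uncased checkpoint, a hypothetical label count, and toy texts standing in for a financial corpus.

# Minimal sketch (illustrative, not the authors' implementation): fine-tune a
# generic pre-trained model for multi-class text classification. The checkpoint
# could be swapped for DistilBERT, RoBERTa, or a FinBERT variant.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "bert-base-uncased"   # assumption: generic BERT checkpoint
NUM_LABELS = 5                     # hypothetical number of document classes

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_LABELS)

class TextDataset(torch.utils.data.Dataset):
    """Wraps raw texts and integer labels as model-ready tensors."""
    def __init__(self, texts, labels):
        self.encodings = tokenizer(texts, truncation=True, padding=True, max_length=128)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

# Toy examples standing in for a labeled financial document dataset.
train_ds = TextDataset(["quarterly revenue rose sharply", "loan application form received"], [0, 1])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1, per_device_train_batch_size=2),
    train_dataset=train_ds,
)
trainer.train()

The same loop can be repeated with a domain-specific checkpoint (e.g., a FinBERT variant) to compare it against the generic models, which is the comparison the paper reports.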
Pages: 260-268 (9 pages)