TWilBert: Pre-trained deep bidirectional transformers for Spanish Twitter

Cited by: 29
Authors
Gonzalez, Jose Angel [1 ]
Hurtado, Lluis-F. [1 ]
Pla, Ferran [1 ]
Affiliations
[1] Univ Politecn Valencia, VRAIN Valencian Res Inst Artificial Intelligence, Cami Vera Sn, Valencia 46022, Spain
Keywords
Contextualized Embeddings; Spanish; Twitter; TWilBERT;
DOI
10.1016/j.neucom.2020.09.078
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
In recent years, the Natural Language Processing community has been moving from uncontextualized word embeddings towards contextualized word embeddings. Among these contextualized architectures, BERT stands out due to its capacity to compute bidirectional contextualized word representations. However, the competitive performance BERT achieves on English downstream tasks is not matched by its multilingual version when applied to other languages and domains. This is especially true for the Spanish language used on Twitter. In this work, we propose TWilBERT, a specialization of the BERT architecture for both the Spanish language and the Twitter domain. Furthermore, we propose a Reply Order Prediction signal to learn inter-sentence coherence in Twitter conversations, which improves the performance of TWilBERT on text classification tasks that require reasoning over sequences of tweets. We perform an extensive evaluation of TWilBERT models on 14 different text classification tasks, such as irony detection, sentiment analysis, and emotion detection. The results obtained by TWilBERT outperform the state-of-the-art systems and Multilingual BERT. In addition, we carry out a thorough analysis of the TWilBERT models to study the reasons for their competitive behavior. We release the pre-trained TWilBERT models used in this paper, along with a framework for training, evaluating, and fine-tuning TWilBERT models. (C) 2020 Elsevier B.V. All rights reserved.
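The Reply Order Prediction (ROP) signal is only described at a high level in the abstract. The following is a minimal sketch of how ROP training pairs could be constructed from (tweet, reply) conversations, assuming a binary coherence label analogous to sentence-order prediction; the helper name build_rop_examples and the data format are illustrative assumptions, not taken from the paper.

import random

def build_rop_examples(conversations, seed=0):
    """Build illustrative Reply Order Prediction (ROP) training pairs.

    Each conversation is a (tweet, reply) tuple. A positive example keeps
    the natural order (tweet -> reply, label 1); a negative example swaps
    the two segments (reply -> tweet, label 0), so a model trained on these
    pairs must capture inter-sentence coherence in Twitter conversations.
    """
    rng = random.Random(seed)
    examples = []
    for tweet, reply in conversations:
        if rng.random() < 0.5:
            examples.append({"segment_a": tweet, "segment_b": reply, "label": 1})
        else:
            examples.append({"segment_a": reply, "segment_b": tweet, "label": 0})
    return examples

if __name__ == "__main__":
    convs = [
        ("que pasa hoy en valencia?", "hay fallas, la ciudad esta llena"),
        ("alguien vio el partido?", "si, golazo en el minuto 90"),
    ]
    for ex in build_rop_examples(convs):
        print(ex)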
Pages: 58-69
Number of pages: 12