Team "NoConflict" at CASE 2021 Task 1: Pretraining for Sentence-Level Protest Event Detection

Cited by: 0
Authors
Hu, Tiancheng [1]
Stoehr, Niklas [1]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
Source
CASE 2021: THE 4TH WORKSHOP ON CHALLENGES AND APPLICATIONS OF AUTOMATED EXTRACTION OF SOCIO-POLITICAL EVENTS FROM TEXT (CASE) | 2021
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An ever-increasing amount of text, in the form of social media posts and news articles, gives rise to new challenges and opportunities for the automatic extraction of socio-political events. In this paper, we present our submission to the Shared Tasks on Socio-Political and Crisis Events Detection, Task 1, Multilingual Protest News Detection, Subtask 2, Event Sentence Classification, of CASE @ ACL-IJCNLP 2021. In our submission, we use the RoBERTa model with additional pretraining and achieve the best F1 score of 0.8532 in event sentence classification in English and the second-best F1 score of 0.8700 in Portuguese via simple translation. We analyze the failure cases of our model. We also conduct an ablation study to show the effect of choosing the right pretrained language model, adding training data, and applying data augmentation.
Pages: 152-160
Page count: 9