Team "DaDeFrNi" at CASE 2021 Task 1: Document and Sentence Classification for Protest Event Detection

Cited by: 0
Authors
Re, Francesco Ignazio [1]
Vegh, Daniel [1]
Atzenhofer, Dennis [1]
Stoehr, Niklas [1]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
Source
CASE 2021: THE 4TH WORKSHOP ON CHALLENGES AND APPLICATIONS OF AUTOMATED EXTRACTION OF SOCIO-POLITICAL EVENTS FROM TEXT (CASE) | 2021
Funding
European Research Council
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper accompanies our top-performing submission to the CASE 2021 shared task, which is hosted at the workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text. Subtasks 1 and 2 of Task 1 concern the classification of newspaper articles and sentences as "conflict"-related or "not conflict"-related in four different languages. Our model performs competitively in both subtasks (up to 0.8662 macro F1), obtaining the highest score of all contributions for subtask 1 on Hindi articles (0.7877 macro F1). We describe all experiments conducted with the XLM-RoBERTa (XLM-R) model and report the results obtained in each binary classification task. We propose supplementing the original training data with additional data on political conflict events. In addition, we provide an analysis of unigram probability estimates and geospatial references contained within the original training corpus.
Pages: 171-178
Page count: 8