Check-worthy claim detection across topics for automated fact-checking

Cited by: 1
Authors
Abumansour, Amani S. [1, 2]
Zubiaga, Arkaitz [1]
Affiliations
[1] Queen Mary Univ London, London, England
[2] Taif Univ, Taif, Saudi Arabia
Keywords
Check-worthiness; Check-worthy; Claim detection cross-topic; Automated fact-checking system; LAB;
DOI
10.7717/peerj-cs.1365
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
An important component of an automated fact-checking system is the claim check-worthiness detection system, which ranks sentences by prioritising them based on their need to be checked. Despite a body of research tackling the task, previous research has overlooked the challenging nature of identifying check-worthy claims across different topics. In this article, we assess and quantify the challenge of detecting check-worthy claims for new, unseen topics. After highlighting the problem, we propose the AraCWA model to mitigate the performance deterioration when detecting check-worthy claims across topics. The AraCWA model boosts performance on new topics by incorporating two components, one for few-shot learning and one for data augmentation. Using a publicly available dataset of Arabic tweets covering 14 different topics, we demonstrate that our proposed data augmentation strategy achieves substantial improvements across topics overall, although the extent of the improvement varies by topic. Further, we analyse the semantic similarities between topics, suggesting that the similarity metric could be used as a proxy to determine the difficulty level of an unseen topic prior to undertaking the task of labelling the underlying sentences.
Pages: 19
Related papers
30 items in total
[1]  
Abumansour AS, 2021, CLEF (Working Notes), P424
[2]  
Antoun W., 2020, Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, P9
[3]  
Antoun Wissam, 2021, P 6 AR NAT LANG PROC, P196
[4]   Automatic Fact-Checking Using Context and Discourse Information [J].
Atanasova, Pepa ;
Nakov, Preslav ;
Marquez, Lluis ;
Barron-Cedeno, Alberto ;
Karadzhov, Georgi ;
Mihaylova, Tsvetomila ;
Mohtarami, Mitra ;
Glass, James .
ACM JOURNAL OF DATA AND INFORMATION QUALITY, 2019, 11 (03)
[5]  
Atanasova Pepa, 2018, arXiv
[6]  
Conneau Alexis, 2017, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, P670, DOI 10.18653/v1/D17-1070
[7]  
Derczynski Leon, 2017, Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), P69, DOI 10.18653/v1/S17-2006
[8]  
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[9]   Overview of the CLEF-2019 CheckThat! Lab: Automatic Identification and Verification of Claims [J].
Elsayed, Tamer ;
Nakov, Preslav ;
Barron-Cedeno, Alberto ;
Hasanain, Maram ;
Suwaileh, Reem ;
Da San Martino, Giovanni ;
Atanasova, Pepa .
EXPERIMENTAL IR MEETS MULTILINGUALITY, MULTIMODALITY, AND INTERACTION (CLEF 2019), 2019, 11696 :301-321
[10]  
Feng SY, 2021, FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, P968