Towards Automated Fact-Checking: An Exploratory Study on Identifying Check-worthy Phrases for Verification

Citations: 0
Authors
Bartol, Galo Emanuel Pianciola [1]
Tommasel, Antonela [2]
Affiliations
[1] UNICEN, Fac Ciencias Exactas, Tandil, Buenos Aires, Argentina
[2] CONICET UNICEN, ISISTAN, Tandil, Buenos Aires, Argentina
Source
2024 L Latin American Computer Conference, CLEI 2024 | 2024
Keywords
check-worthiness; fact-checking; social media; text analysis
DOI
10.1109/CLEI64178.2024.10700241
Chinese Library Classification
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
In today's information-saturated social media environment, it is essential to prioritize the verification of potentially false or misleading claims. This need has led to the development of fact-checking, a process dedicated to verifying the truthfulness of statements. Given the limited human resources available to scrutinize all online claims, it is crucial to identify the most critical ones to verify. Therefore, a (semi-)automated system capable of detecting the most urgent and relevant claims for verification is needed. To address this challenge, we evaluate an approach based on Natural Language Processing and Machine Learning techniques. We explore lexical features, embedding models, LLMs, and traditional classification techniques to develop an automated system to classify statements according to their relevance for verification (i.e., their check-worthiness). Our evaluation is based on data collections including checkable statements extracted from tweets and political speeches. Embedding and LLM-based techniques showed great potential to improve the performance of the verification process by effectively prioritizing the most critical and relevant statements for verification.
Pages: 10