Automated Fact-Checking of Claims from Wikipedia

Cited by: 0
Authors
Sathe, Aalok [1 ]
Ather, Salar [1 ]
Le, Tuan Manh [1]
Perry, Nathan [2 ]
Park, Joonsuk [1 ]
Affiliations
[1] Univ Richmond, Dept Math & Comp Sci, Richmond, VA 23173 USA
[2] Williams Coll, Dept Comp Sci, Williamstown, MA 01267 USA
Source
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020) | 2020
Keywords
fact-checking; fact-verification; natural language inference; textual entailment; corpus;
DOI
Not available
CLC classification number
TP39 [Applications of Computers];
Subject classification code
081203; 0835;
Abstract
Automated fact checking is becoming increasingly vital as both truthful and fallacious information accumulate online. Research on fact checking has benefited from large-scale datasets such as FEVER and SNLI. However, such datasets suffer from limited applicability because their claims and/or evidence are written synthetically by annotators and differ from real claims and evidence on the internet. To this end, we present WIKIFACTCHECK-ENGLISH, a dataset of 124k+ triples consisting of a claim, context, and an evidence document extracted from English Wikipedia articles and citations, as well as 34k+ manually written claims that are refuted by the evidence documents. This is the largest fact checking dataset of real claims and evidence to date; it will allow the development of fact checking systems that can better process claims and evidence in the real world. We also show that for the NLI subtask, a logistic regression system trained using existing and novel features achieves a peak accuracy of 68%, providing a competitive baseline for future work. In addition, a decomposable attention model trained on SNLI significantly underperforms models trained on this dataset, suggesting that models trained on manually generated data may not be sufficiently generalizable or suitable for fact checking real-world claims.
Pages: 6874-6882
Number of pages: 9
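The abstract describes WIKIFACTCHECK-ENGLISH as claim/context/evidence triples and reports a logistic regression baseline for the NLI (entailment) subtask. The following is a minimal sketch of what such a baseline could look like; the tokenizer, the lexical features (token overlap and length statistics), the function names, and the toy claim/evidence pairs are illustrative assumptions only and do not correspond to the features or data actually used in the paper.

```python
# Hypothetical sketch of an NLI-style baseline for claim/evidence pairs:
# logistic regression over simple lexical features (assumed for illustration;
# the paper's actual feature set is not specified in this abstract).
import re
import numpy as np
from sklearn.linear_model import LogisticRegression

def tokenize(text: str) -> set:
    """Lowercase, punctuation-free token set (assumed preprocessing)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def pair_features(claim: str, evidence: str) -> list:
    """Simple lexical features for a (claim, evidence) pair."""
    c_tokens = tokenize(claim)
    e_tokens = tokenize(evidence)
    overlap = len(c_tokens & e_tokens)
    return [
        overlap,                          # raw token overlap
        overlap / max(len(c_tokens), 1),  # overlap normalized by claim length
        len(c_tokens),                    # claim length in tokens
        len(e_tokens),                    # evidence length in tokens
    ]

# Toy training pairs (made up): label 1 = supported, 0 = refuted.
pairs = [
    ("The Eiffel Tower is in Paris.",
     "The Eiffel Tower is a wrought-iron tower located in Paris, France.", 1),
    ("The Eiffel Tower is in Rome.",
     "The Eiffel Tower is a wrought-iron tower located in Paris, France.", 0),
]
X = np.array([pair_features(c, e) for c, e, _ in pairs])
y = np.array([label for _, _, label in pairs])

clf = LogisticRegression().fit(X, y)

# High lexical overlap, so the classifier should lean toward 1 (supported).
print(clf.predict([pair_features("The Eiffel Tower is in Paris.",
                                 "The Eiffel Tower is in Paris, France.")]))
```

In practice, a dataset like the one described would supply the training pairs, and the feature set would be replaced or augmented by the "existing and novel features" the authors mention; the sketch only illustrates the overall shape of the subtask.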