Applying automatic text-based detection of deceptive language to police reports: Extracting behavioral patterns from a multi-step classification model to understand how we lie to the police

Cited by: 16
Authors
Quijano-Sanchez, Lara [1]
Liberatore, Federico [1, 2]
Camacho-Collados, Jose [3]
Camacho-Collados, Miguel [4]
Affiliations
[1] Univ Carlos III Madrid, BS Inst Financial Big Data UC3M, Madrid, Spain
[2] Univ Complutense Madrid, Dept Stat & Operat Res, Madrid, Spain
[3] Univ Roma La Sapienza, Dept Comp Sci, Rome, Italy
[4] Interior Minist, State Secretariat Secur, Madrid, Spain
Keywords
Lie detection; Information extraction; Predictive policing; Model knowledge extraction; Natural language processing; Decision support systems; REGRESSION; CUES
DOI
10.1016/j.knosys.2018.03.010
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Filing a false police report is a crime that has dire consequences for both the individual and the system. In fact, it may be charged as a misdemeanor or a felony. For society, a false report results in the loss of police resources and the contamination of police databases used to carry out investigations and assess the risk of crime in a territory. In this research, we present VeriPol, a model for the detection of false robbery reports based solely on their text. This tool, developed in collaboration with the Spanish National Police, combines Natural Language Processing and Machine Learning methods in a decision support system that provides police officers with the probability that a given report is false. VeriPol has been tested on more than 1000 reports from 2015 provided by the Spanish National Police. Empirical results show that it is extremely effective in discriminating between false and true reports, with a success rate of more than 91%, improving on the accuracy of expert police officers on the same dataset by more than 15%. The underlying classification model can be analysed to extract patterns and insights showing how people lie to the police (as well as how to get away with false reporting). In general, the more details provided in the report, the more likely it is to be honest. Finally, a pilot study carried out in June 2017 has demonstrated the usefulness of VeriPol in the field. © 2018 Elsevier B.V. All rights reserved.
Pages: 155-168
Number of pages: 14