An Empirical Study on Detecting Deception and Cybercrime Using Artificial Neural Networks

Cited by: 1

Authors:

Mbaziira, Alex V. [1]

Murphy, Diane R. [1]

Affiliations:

[1] Marymount Univ, 1000 North Glebe Rd, Arlington, VA 22207 USA

Source:

PROCEEDINGS OF THE 2018 2ND INTERNATIONAL CONFERENCE ON COMPUTE AND DATA ANALYSIS (ICCDA 2018) | 2015

Keywords:

Cybercrime; deception; supervised learning; artificial neural networks; natural language processing; PREDICTING DECEPTION; WORDS;

DOI:

10.1145/3193077.3193080

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

The ubiquity of the Internet and the wide adoption of computing and mobile devices are driving an explosion of data. Cybercriminals are also leveraging these popular technologies to profit from cybercrime in the form of scams, fraud, and fake online reviews. Existing content filtering techniques, which have been successful in containing spam, are failing to filter these new types of cybercrime because cybercriminals craft text messages to bypass content filters. In this paper, we use natural language processing and a deception-detection discourse to build hybrid models for detecting these forms of text-based cybercrime. We have four datasets, each containing truthful text messages and deceptive text messages that represent a specific type of cybercrime. We combine the datasets in groups of two and three to generate training sets for hybrid models that cover more than one type of cybercrime. The hybrid cybercrime detection models are trained using Artificial Neural Networks (ANN), Naive Bayes (NB), Support Vector Machines (SVM), and k-Nearest Neighbor (kNN). The models are then evaluated on test sets containing instances that were not part of the training sets. The performance of the NB, kNN, and SVM classifiers is compared against that of ANN. Most of the models generalize well in detecting cybercrime. ANN accuracy on the test sets ranges from 70% to 90%, compared with 60% to 80% for the other three classifiers. The best performance is in detecting unfavorable fake reviews and fraud.
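The dataset-combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the dataset names, the toy messages, the 80/20 split, and the choice to enumerate every pair and triple are all assumptions made for illustration. The abstract states only that datasets were combined in twos and threes and that test instances were kept out of the training sets.

```python
from itertools import combinations
import random

# Hypothetical placeholder names; the abstract does not name the four datasets.
DATASET_NAMES = ["scam", "fraud", "favorable_reviews", "unfavorable_reviews"]


def make_toy_dataset(name, n=10):
    """Build fake (text, label) rows; label 1 = deceptive, 0 = truthful."""
    return [(f"{name} message {i}", i % 2) for i in range(n)]


def split(rows, test_fraction=0.2, seed=0):
    """Shuffle one dataset and split it into disjoint train and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def build_hybrid_sets(datasets, sizes=(2, 3)):
    """Combine per-dataset splits into hybrid train/test sets.

    Each dataset is split once, so a message held out for testing never
    appears in any hybrid training set (an assumed design, not the paper's).
    """
    splits = {name: split(rows) for name, rows in datasets.items()}
    hybrids = {}
    for k in sizes:
        for combo in combinations(sorted(datasets), k):
            train = [row for name in combo for row in splits[name][0]]
            test = [row for name in combo for row in splits[name][1]]
            hybrids[combo] = (train, test)
    return hybrids


datasets = {name: make_toy_dataset(name) for name in DATASET_NAMES}
hybrids = build_hybrid_sets(datasets)
print(len(hybrids))  # 6 pairs + 4 triples = 10 hybrid sets
```

Each hybrid training set would then be passed to the four classifiers, and each classifier would be scored on the matching hybrid test set. Splitting before combining is one simple way to guarantee that test instances stay unseen during training.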

Pages: 42-46

Page count: 5
