An Empirical Study on Detecting Deception and Cybercrime Using Artificial Neural Networks

Cited by: 1

Authors:

Mbaziira, Alex V. [1]

Murphy, Diane R. [1]

Affiliations:

[1] Marymount Univ, 1000 North Glebe Rd, Arlington, VA 22207 USA

Source:

PROCEEDINGS OF THE 2018 2ND INTERNATIONAL CONFERENCE ON COMPUTE AND DATA ANALYSIS (ICCDA 2018) | 2018

Keywords:

Cybercrime; deception; supervised learning; artificial neural networks; natural language processing; PREDICTING DECEPTION; WORDS;

DOI:

10.1145/3193077.3193080

Chinese Library Classification (CLC):

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

The ubiquity of the Internet and the wide adoption of computing and mobile devices are driving an explosion of data. Cybercriminals are also leveraging these popular technologies to cash in on cybercrime in the form of scams, fraud and fake online reviews. Existing content filtering techniques, which have been successful in containing spam, are failing to filter these new types of cybercrime because cybercriminals craft text messages that bypass content filters. In this paper, we use natural language processing and a deception-detection discourse to build hybrid models for detecting these forms of text-based cybercrime. We have four datasets, each of which contains deceptive text messages representing a specific type of cybercrime together with truthful text messages. We combine these datasets two and three at a time to generate training sets for hybrid models that cover more than one type of cybercrime. The hybrid cybercrime detection models are trained using Artificial Neural Networks (ANN), Naive Bayes (NB), Support Vector Machines (SVM) and k-Nearest Neighbors (kNN). The models are then evaluated on test sets containing instances that were not part of the training sets. The results for the NB, kNN and SVM classifiers are compared against those of ANN. Most of the models generalize well in detecting cybercrime. ANN accuracy on the test sets ranges from 70% to 90%, compared with 60% to 80% for the other three classifiers. The best performance is in detecting unfavorable fake reviews and fraud.
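The dataset combination step described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' code: the dataset names, the toy messages, the 25% test fraction, and the choice to split each dataset before merging are all assumptions made for the example. It enumerates every pair and triple of four datasets and builds a training set and a disjoint held-out test set for each hybrid.

```python
from itertools import combinations
import random

# Hypothetical toy data (names and texts are placeholders, not the paper's).
# Each instance is (text, label) with label 1 = deceptive, 0 = truthful.
DATASETS = {
    "scam": [("claim your prize now", 1), ("pay fee to unlock funds", 1),
             ("lunch at noon tomorrow", 0), ("see attached agenda", 0)],
    "fraud": [("move losses off the books", 1), ("hide this from auditors", 1),
              ("quarterly report is ready", 0), ("budget review on monday", 0)],
    "fake_review_pos": [("best hotel ever truly perfect", 1), ("amazing flawless stay", 1),
                        ("room was clean and quiet", 0), ("breakfast was fine", 0)],
    "fake_review_neg": [("worst place avoid at all costs", 1), ("horrible ruined my life", 1),
                        ("check in took ten minutes", 0), ("wifi was slow upstairs", 0)],
}

def hybrid_combinations(names, sizes=(2, 3)):
    """List every combination of dataset names taken 2 and 3 at a time."""
    return [combo for k in sizes for combo in combinations(sorted(names), k)]

def split(instances, test_fraction=0.25, seed=0):
    """Shuffle a copy and split it so no test instance is used for training."""
    rng = random.Random(seed)
    shuffled = list(instances)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def build_hybrid(combo, datasets, test_fraction=0.25, seed=0):
    """Merge the per-dataset train and test portions for one hybrid model."""
    train, test = [], []
    for name in combo:
        tr, te = split(datasets[name], test_fraction, seed)
        train.extend(tr)
        test.extend(te)
    return train, test

combos = hybrid_combinations(DATASETS)
print(len(combos))  # 6 pairs + 4 triples = 10 hybrid training sets

for combo in combos:
    train, test = build_hybrid(combo, DATASETS)
    print(combo, len(train), "train /", len(test), "test")
```

Each resulting `(train, test)` pair could then be passed to any of the four classifier types named in the abstract. Splitting before merging is one simple way to guarantee that test instances never appear in the training set.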

Pages: 42-46

Page count: 5

