CNN Based Malicious Website Detection by Invalidating Multiple Web Spams

Cited by: 18

Authors:

Liu, Dongjie [1,2]

Lee, Jong-Hyouk [3]

Affiliations:

[1] Chinese Acad Sci, Comp Network Informat Ctr, Beijing 100190, Peoples R China

[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100190, Peoples R China

[3] Sejong Univ, Dept Comp & Informat Secur, Seoul 13557, South Korea

Source:

IEEE ACCESS | 2020 / Vol. 8

Keywords:

Machine learning; Internet; Browsers; Uniform resource locators; Support vector machines; Feature extraction; Crawlers; Convolutional neural network; machine learning; malicious website detection; NEURAL-NETWORK; DEEP CNN;

DOI:

10.1109/ACCESS.2020.2995157

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Although a variety of techniques for detecting malicious websites have been proposed, it is becoming increasingly difficult for these methods to deliver satisfactory results. Many malicious websites still escape detection by using various Web spam techniques. In this paper, we first summarize three types of Web spam techniques used by malicious websites: redirection spam, hidden IFrame spam, and content hiding spam. We then present a new detection method that adopts the user's perspective and takes screenshots of malicious webpages to invalidate Web spam. The proposed method uses a Convolutional Neural Network (CNN), a class of deep neural networks, as its classification algorithm. To verify the effectiveness of the method, we conducted two experiments. First, we tested the proposed method on a constructed complex dataset and compared it with representative machine learning-based detection algorithms. Second, we tested the proposed method on detecting malicious websites in a real-world Web environment for three months. The experimental results show that the proposed method achieves better performance and is applicable to a practical Web environment.
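The abstract names a CNN applied to webpage screenshots but this record does not give the network's architecture. The following is only a minimal sketch of the generic building blocks such a classifier stacks (convolution, ReLU, max pooling, and a sigmoid score), written in plain Python. The toy 5x5 "screenshot", the edge kernel, and the `weight` and `bias` values are hypothetical and untrained; none of them come from the paper.

```python
import math


def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the core CNN operation."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Element-wise ReLU: negative activations are clamped to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling; an odd trailing row/column is dropped."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


def score(image, kernel, weight, bias):
    """Conv -> ReLU -> pool -> global average -> sigmoid, a value in (0, 1)."""
    pooled = max_pool2x2(relu(conv2d(image, kernel)))
    values = [v for row in pooled for v in row]
    mean = sum(values) / len(values)
    return 1.0 / (1.0 + math.exp(-(weight * mean + bias)))


# Hypothetical 5x5 grayscale "screenshot": dark left part, bright right part.
screenshot = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

# Hypothetical, untrained kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1.0, 1.0],
               [-1.0, 1.0]]

feature_map = conv2d(screenshot, edge_kernel)
probability = score(screenshot, edge_kernel, weight=1.0, bias=0.0)
```

In a trained network the kernels and the final weights would be learned from labeled screenshots, and many such layers would be stacked; here they are fixed by hand only to show how the data flows through the layers.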


Pages: 97258-97266

Page count: 9

