ZipAST: Enhancing malicious JavaScript detection with sequence compression

Cited by: 0

Authors:

Chen, Zixian [1]

Wang, Weiping [1]

Qin, Yan [1]

Zhang, Shigeng [1]

Affiliations:

[1] Central South University, School of Computer Science and Engineering, Changsha, People's Republic of China

Source:

COMPUTERS & SECURITY | 2025 / Volume 153

Keywords:

Malicious JavaScript; Malware detection; Obfuscated code; Sequence compression; Deep learning

DOI:

10.1016/j.cose.2025.104390

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

JavaScript is a key component of websites and greatly enhances web page functionality. At the same time, it has become one of the most common attack vectors in malicious web pages. Early approaches to detecting malicious scripts relied heavily on manual feature engineering by security experts and had limited feature representation capabilities. With advances in deep learning, neural networks have shown the ability to automatically learn strong feature representations from malicious JavaScript. At present, mainstream detection methods usually extract the Abstract Syntax Tree (AST) from JavaScript code, which captures the code's semantic information. The information about AST nodes is then processed into a sequence using depth-first traversal and fed into deep learning models. However, for large JavaScript library files and obfuscated JavaScript code, limits on computational power and hardware make it difficult to feed the complete information into the model. Only a part of the sequence is sampled for training and detection, which significantly diminishes the model's detection capability. To address this, this paper proposes a method for malicious JavaScript detection based on sequence compression. The approach extracts input sequences composed solely of AST node type information and applies a compression algorithm to reduce their length further. Technically, we first extract the type field of each AST node in depth-first traversal order to generate the sequence, and then compress the sequence using Byte Pair Encoding. Finally, the compressed sequence is fed into the deep learning model for detection. On publicly available datasets, when the same deep learning model is used for classification, the proposed method outperforms other existing approaches, achieving a precision of 98.96% and a recall of 96.37%.
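
The two preprocessing steps named in the abstract (depth-first extraction of node types, then Byte Pair Encoding) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which this record does not describe. It assumes the AST is already available as an ESTree-style nested dictionary, such as a JavaScript parser like Esprima can produce. The function names, the "+" separator for merged tokens, and the number of merges are all hypothetical choices made for illustration.

```python
from collections import Counter


def node_types(node):
    """Yield the `type` field of every AST node in depth-first (pre-order) order."""
    if isinstance(node, dict):
        if "type" in node:
            yield node["type"]
        # Dicts preserve insertion order, so children are visited in source order.
        for value in node.values():
            yield from node_types(value)
    elif isinstance(node, list):
        for child in node:
            yield from node_types(child)


def merge_pair(seq, pair, new_token):
    """Replace each non-overlapping occurrence of `pair` in `seq` with `new_token`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def bpe_compress(seq, num_merges):
    """Greedy BPE over tokens: repeatedly merge the most frequent adjacent pair."""
    seq = list(seq)
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # nothing repeats, so merging no longer shortens the sequence
            break
        seq = merge_pair(seq, pair, pair[0] + "+" + pair[1])
        merges.append(pair)
    return seq, merges


# Hand-written ESTree-style AST for the script `f(1); f(2);` (illustrative input).
def call_statement(arg):
    return {
        "type": "ExpressionStatement",
        "expression": {
            "type": "CallExpression",
            "callee": {"type": "Identifier", "name": "f"},
            "arguments": [{"type": "Literal", "value": arg}],
        },
    }


ast = {"type": "Program", "body": [call_statement(1), call_statement(2)]}

types = list(node_types(ast))
compressed, merges = bpe_compress(types, num_merges=10)
print(len(types), "->", len(compressed))
```

In this toy input the repeated statement pattern collapses into a single merged token, which is the effect the abstract relies on to fit long sequences from large or obfuscated scripts into a model's input. In a real pipeline the merge table would presumably be learned on a training corpus and then applied to unseen scripts, which this sketch does not show.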

Pages: 13

