ZipAST: Enhancing malicious JavaScript detection with sequence compression

Cited by: 0

Authors:

Chen, Zixian ^{[1]}

Wang, Weiping ^{[1]}

Qin, Yan ^{[1]}

Zhang, Shigeng ^{[1]}

Affiliations:

[1] Central South University, School of Computer Science and Engineering, Changsha, People's Republic of China

Source:

COMPUTERS & SECURITY | 2025 / Vol. 153

Keywords:

Malicious JavaScript; Malware detection; Obfuscated code; Sequence compression; Deep learning;

DOI:

10.1016/j.cose.2025.104390

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

JavaScript is a key component of websites and greatly enhances web page functionality. At the same time, it has become one of the most common attack vectors in malicious web pages. Early approaches to detecting malicious scripts relied heavily on manual feature engineering by security experts and had limited feature representation capabilities. With advances in deep learning, neural networks have shown the ability to automatically learn strong feature representations from malicious JavaScript. Current mainstream detection methods usually extract the Abstract Syntax Tree (AST) from JavaScript code, which captures the code's semantic information. The AST node information is then serialized into a sequence by depth-first traversal and fed into deep learning models. However, for large JavaScript library files and obfuscated JavaScript code, computational and hardware constraints make it difficult to feed the complete sequence into the model. Only a part of the sequence is sampled for training and detection, which significantly diminishes the model's detection capability. To address this, this paper proposes a method for malicious JavaScript detection based on sequence compression. The approach builds input sequences solely from AST node type information and applies a compression algorithm to shorten them further. Specifically, we first extract the type field of each AST node in depth-first traversal order to generate the sequence, and then compress the sequence using Byte Pair Encoding. Finally, the compressed sequence is fed into the deep learning model for detection. On publicly available datasets, when the same deep learning model is used for classification, the proposed method outperforms other existing approaches, achieving a precision of 98.96% and a recall of 96.37%.
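The two preprocessing steps named in the abstract (depth-first extraction of node `type` fields, then Byte Pair Encoding over the resulting sequence) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes an ESTree-style AST already parsed into nested dicts and lists, uses a hand-built toy AST for `a(1); b(2);`, and treats each node type as one symbol. The helper names (`node_types`, `learn_bpe`, `compress`) and the `+` separator in merged tokens are hypothetical choices made only for illustration.

```python
from collections import Counter


def node_types(node):
    """Yield the `type` field of every AST node in depth-first (pre-order) order."""
    if isinstance(node, dict):
        if "type" in node:
            yield node["type"]
        for value in node.values():
            yield from node_types(value)
    elif isinstance(node, list):
        for item in node:
            yield from node_types(item)


def merge_pair(seq, pair, merged):
    """Replace each non-overlapping occurrence of `pair` in `seq` with `merged`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(merged)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_bpe(sequences, num_merges):
    """Learn up to `num_merges` merge rules from a corpus of type sequences."""
    sequences = [list(s) for s in sequences]
    merges = []
    for _ in range(num_merges):
        counts = Counter()
        for seq in sequences:
            counts.update(zip(seq, seq[1:]))  # count adjacent symbol pairs
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:  # nothing repeats any more, so merging would not help
            break
        merged = pair[0] + "+" + pair[1]  # assumed naming scheme for merged symbols
        merges.append((pair, merged))
        sequences = [merge_pair(s, pair, merged) for s in sequences]
    return merges


def compress(seq, merges):
    """Apply learned merge rules, in the order they were learned, to one sequence."""
    seq = list(seq)
    for pair, merged in merges:
        seq = merge_pair(seq, pair, merged)
    return seq


def call_stmt(name, arg):
    """Build a toy ESTree-style statement node for `name(arg);`."""
    return {
        "type": "ExpressionStatement",
        "expression": {
            "type": "CallExpression",
            "callee": {"type": "Identifier", "name": name},
            "arguments": [{"type": "Literal", "value": arg}],
        },
    }


# Toy AST for the script `a(1); b(2);` (hand-built, not produced by a real parser).
ast = {"type": "Program", "body": [call_stmt("a", 1), call_stmt("b", 2)]}

type_seq = list(node_types(ast))
merges = learn_bpe([type_seq], num_merges=10)
compressed = compress(type_seq, merges)

print(len(type_seq), len(compressed))  # 9 3
```

In this toy case the repeated statement pattern collapses into a single merged symbol, so the sequence shrinks from 9 symbols to 3. In a realistic setting the merge rules would be learned on a training corpus and then applied unchanged to unseen scripts, and the compressed symbols would be mapped to integer ids before being fed to a model.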

Pages: 13
