ZipAST: Enhancing malicious JavaScript detection with sequence compression

Cited by: 0
Authors
Chen, Zixian [1]
Wang, Weiping [1]
Qin, Yan [1]
Zhang, Shigeng [1]
Affiliations
[1] Central South University, School of Computer Science and Engineering, Changsha, People's Republic of China
Keywords
Malicious JavaScript; Malware detection; Obfuscated code; Sequence compression; Deep learning
DOI
10.1016/j.cose.2025.104390
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
JavaScript is a key component of websites and greatly enhances web page functionality. At the same time, it has become one of the most common attack vectors in malicious web pages. Early approaches to detecting malicious scripts relied heavily on manual feature engineering by security experts and had limited feature representation capabilities. With advances in deep learning, neural networks have shown the ability to automatically learn strong feature representations from malicious JavaScript. Current mainstream detection methods usually extract the Abstract Syntax Tree (AST) from JavaScript code, which captures the code's semantic information. The AST node information is then flattened into a sequence by depth-first traversal and fed into deep learning models. However, for large JavaScript library files and obfuscated JavaScript code, limits on computational power and hardware make it difficult to feed the complete sequence into the model. Only a part of the sequence is sampled for training and detection, which significantly diminishes the model's detection capability. To address this, this paper proposes a method for malicious JavaScript detection based on sequence compression. The approach extracts input sequences composed solely of AST node type information and applies a compression algorithm to shorten them further. Specifically, we first extract the type field of each AST node in depth-first traversal order to generate the sequence, and then compress the sequence using Byte Pair Encoding. Finally, the compressed sequence is fed into the deep learning model for detection. On publicly available datasets, with the same deep learning model used for classification, the proposed method outperforms other existing approaches, achieving a precision of 98.96% and a recall of 96.37%.
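The two preprocessing steps named in the abstract (depth-first extraction of node `type` fields, then Byte Pair Encoding) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an ESTree-style AST represented as nested Python dicts and lists, learns merges on a single sequence instead of a training corpus, and joins merged tokens with `+` purely for readability. The names `node_types`, `merge_pair`, `learn_bpe`, and `example_ast` are hypothetical.

```python
from collections import Counter

def node_types(node):
    """Yield the `type` field of each AST node in depth-first (pre-order) order.

    Assumption: the AST is an ESTree-style structure of nested dicts and lists.
    """
    if isinstance(node, dict):
        if "type" in node:
            yield node["type"]
        for value in node.values():
            yield from node_types(value)
    elif isinstance(node, list):
        for child in node:
            yield from node_types(child)

def merge_pair(seq, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged token."""
    merged = pair[0] + "+" + pair[1]  # illustrative naming for the new token
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(merged)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

def learn_bpe(seq, num_merges):
    """Greedily merge the most frequent adjacent pair, up to `num_merges` times."""
    seq = list(seq)
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # nothing repeats, so merging would not shorten anything
            break
        merges.append(pair)
        seq = merge_pair(seq, pair)
    return merges, seq

def call_stmt(arg):
    """Hand-written AST fragment for the statement `f(<arg>);`."""
    return {
        "type": "ExpressionStatement",
        "expression": {
            "type": "CallExpression",
            "callee": {"type": "Identifier", "name": "f"},
            "arguments": [{"type": "Literal", "value": arg}],
        },
    }

# Hand-written AST for the script `f(1); f(2);`
example_ast = {"type": "Program", "body": [call_stmt(1), call_stmt(2)]}

types = list(node_types(example_ast))
merges, compressed = learn_bpe(types, num_merges=10)
print(len(types), "->", len(compressed))
```

In this toy input the repeated call statement collapses into a single merged token, so the sequence shrinks from nine node types to three tokens. A realistic pipeline would learn the merge table over many training scripts and then apply it to unseen ones.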
Pages: 13
Related papers
30 in total
  • [1] Detection of Obfuscated Malicious JavaScript Code
    Alazab, Ammar
    Khraisat, Ansam
    Alazab, Moutaz
    Singh, Sarabjot
    FUTURE INTERNET, 2022, 14 (08):
  • [2] Malicious JavaScript Detection Based on Bidirectional LSTM Model
    Song, Xuyan
    Chen, Chen
    Cui, Baojiang
    Fu, Junsong
    APPLIED SCIENCES-BASEL, 2020, 10 (10):
  • [3] Detection of malicious javascript on an imbalanced dataset
    Phung, Ngoc Minh
    Mimura, Mamoru
    INTERNET OF THINGS, 2021, 13
  • [4] Static Detection of Malicious JavaScript-Bearing PDF Documents
    Laskov, Pavel
    Srndic, Nedim
    27TH ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE (ACSAC 2011), 2011: 373-382
  • [5] ScriptNet: Neural Static Analysis for Malicious JavaScript Detection
    Stokes, Jack W.
    Agrawal, Rakshit
    McDonald, Geoff
    Hausknecht, Matthew
    MILCOM 2019 - 2019 IEEE MILITARY COMMUNICATIONS CONFERENCE (MILCOM), 2019,
  • [6] Detection and Mitigation Of Malicious JavaScript Using Information Flow Control
    Sayed, Bassam
    Traore, Issa
    Abdelhalim, Amany
    2014 TWELFTH ANNUAL INTERNATIONAL CONFERENCE ON PRIVACY, SECURITY AND TRUST (PST), 2014: 264-273
  • [7] Analysis and Identification of Malicious JavaScript Code
    Fraiwan, Mohammad
    Al-Salman, Rami
    Khasawneh, Natheer
    Conrad, Stefan
    INFORMATION SECURITY JOURNAL, 2012, 21 (01): 1-11
  • [8] JSTAP: A Static Pre-Filter for Malicious JavaScript Detection
    Fass, Aurore
    Backes, Michael
    Stock, Ben
    35TH ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE (ACSAC), 2019: 257-269
  • [9] A Half-Dynamic Classification Method on Obfuscated Malicious JavaScript Detection
    Fang, Zhaolin
    Zhu, Renhuan
    Zhang, Weihui
    Chen, Bo
    INTERNATIONAL JOURNAL OF SECURITY AND ITS APPLICATIONS, 2015, 9 (06): 251-262
  • [10] Deep Neural Networks for Malicious JavaScript Detection Using Bytecode Sequences
    Rozi, Muhammad Fakhrur
    Kim, Sangwook
    Ozawa, Seiichi
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,