ZipAST: Enhancing malicious JavaScript detection with sequence compression

Cited by: 0
Authors
Chen, Zixian [1]
Wang, Weiping [1]
Qin, Yan [1]
Zhang, Shigeng [1]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha, Peoples R China
Keywords
Malicious JavaScript; Malware detection; Obfuscated code; Sequence compression; Deep learning
DOI
10.1016/j.cose.2025.104390
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
JavaScript is a key component of websites and greatly enhances web page functionality. At the same time, it has become one of the most common attack vectors in malicious web pages. Early approaches to detecting malicious scripts relied heavily on manual feature engineering by security experts and had limited feature representation capabilities. With advances in deep learning, neural networks have shown the ability to automatically learn strong feature representations from malicious JavaScript. At present, mainstream detection methods usually extract the Abstract Syntax Tree (AST) from JavaScript code, which captures the code's semantic information. The AST node information is then flattened into a sequence by depth-first traversal and fed into deep learning models. However, for large JavaScript library files and obfuscated JavaScript code, computational and hardware constraints make it difficult to feed the complete sequence into the model. Only a part of the sequence is sampled for training and detection, which significantly diminishes the model's detection capability. To address this, this paper proposes a method for malicious JavaScript detection based on sequence compression. The approach extracts input sequences composed solely of AST node type information and applies a compression algorithm to reduce their length further. Technically, we first extract the type field of each AST node in depth-first traversal order to generate the sequence, and then compress the sequence using Byte Pair Encoding. Finally, the compressed sequence is fed into the deep learning model for detection. On publicly available datasets, when the same deep learning model is used for classification, the proposed method outperforms other existing approaches, achieving a precision of 98.96% and a recall of 96.37%.
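The two preprocessing steps named in the abstract (type-only depth-first extraction, then Byte Pair Encoding) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the AST is an ESTree-style nested dict, as produced by parsers such as Esprima, and it learns merges from a single sequence for brevity, whereas a real pipeline would more likely learn them over a training corpus. The function names `node_types`, `merge_pair`, and `bpe_compress` are hypothetical.

```python
from collections import Counter


def node_types(node):
    """Yield the `type` field of every AST node in pre-order depth-first order."""
    if isinstance(node, dict):
        if "type" in node:
            yield node["type"]
        for value in node.values():  # dicts keep insertion (source) order
            yield from node_types(value)
    elif isinstance(node, list):
        for item in node:
            yield from node_types(item)


def merge_pair(seq, pair, new_token):
    """Replace each non-overlapping occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def bpe_compress(seq, num_merges):
    """Greedy BPE: repeatedly merge the most frequent adjacent token pair."""
    seq, merges = list(seq), []
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # nothing repeats, so merging no longer shortens anything
            break
        seq = merge_pair(seq, pair, pair[0] + "+" + pair[1])
        merges.append(pair)
    return seq, merges


# Hand-written ESTree-style AST for: var a = 1; var b = 2;
def _decl(name, value):
    return {"type": "VariableDeclaration", "kind": "var", "declarations": [
        {"type": "VariableDeclarator",
         "id": {"type": "Identifier", "name": name},
         "init": {"type": "Literal", "value": value}}]}


ast = {"type": "Program", "body": [_decl("a", 1), _decl("b", 2)]}

types = list(node_types(ast))
compressed, merges = bpe_compress(types, num_merges=10)
print(len(types), "->", len(compressed))  # prints "9 -> 3"
print(compressed)
```

In this toy input the repeated declaration pattern collapses into a single merged token, which is the effect the abstract relies on to fit long sequences from large or obfuscated scripts into a fixed model input length.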
Pages: 13
Related papers
30 in total
  • [11] HIDENOSEEK: Camouflaging Malicious JavaScript in Benign ASTs
    Fass, Aurore
    Backes, Michael
    Stock, Ben
    PROCEEDINGS OF THE 2019 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY (CCS'19), 2019, : 1899 - 1913
  • [12] A deep learning approach for detecting malicious JavaScript code
    Wang, Yao
    Cai, Wan-dong
    Wei, Peng-cheng
    SECURITY AND COMMUNICATION NETWORKS, 2016, 9 (11) : 1520 - 1534
  • [13] Detecting Malicious Javascript in PDF through Document Instrumentation
    Liu, Daiping
    Wang, Haining
    Stavrou, Angelos
    2014 44TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS (DSN), 2014, : 100 - 111
  • [14] JStrong: Malicious JavaScript detection based on code semantic representation and graph neural network
    Fang, Yong
    Huang, Chaoyi
    Zeng, Minchuan
    Zhao, Zhiying
    Huang, Cheng
    COMPUTERS & SECURITY, 2022, 118
  • [15] JSRevealer: A Robust Malicious JavaScript Detector against Obfuscation
    Ren, Kunlun
    Qiang, Weizhong
    Wu, Yueming
    Zhou, Yi
    Zou, Deqing
    Jin, Hai
    2023 53RD ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, DSN, 2023, : 339 - 351
  • [16] Towards the Detection of Malicious Java Packages
    Ladisa, Piergiorgio
    Plate, Henrik
    Martinez, Matias
    Barais, Olivier
    Ponta, Serena Elisa
    PROCEEDINGS OF THE 2022 ACM WORKSHOP ON SOFTWARE SUPPLY CHAIN OFFENSIVE RESEARCH AND ECOSYSTEM DEFENSES, SCORED 2022, 2022, : 63 - 72
  • [17] MOJI: Character-level convolutional neural networks for Malicious Obfuscated JavaScript Inspection
    Ishida, Minato
    Kaneko, Naoshi
    Sumi, Kazuhiko
    APPLIED SOFT COMPUTING, 2023, 137
  • [18] JSDES - An Automated De-Obfuscation System for Malicious JavaScript
    AbdelKhalek, Moataz
    Shosha, Ahmed
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY (ARES 2017), 2017,
  • [19] On improvements of robustness of obfuscated JavaScript code detection
    Ponomarenko, G. S.
    Klyucharev, P. G.
    JOURNAL OF COMPUTER VIROLOGY AND HACKING TECHNIQUES, 2023, 19 (03) : 387 - 398
  • [20] JSAC: A Novel Framework to Detect Malicious JavaScript via CNNs over AST and CFG
    Jiang, Hongliang
    Yang, Yuxing
    Sun, Lu
    Jiang, Lin
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,