CADefender: Detection of unknown malicious AutoLISP computer-aided design files using designated feature extraction and machine learning methods

Citations: 0
Authors
Yevsikov, Alexander [1, 2]
Muralidharan, Trivikram [1, 2]
Panker, Tomer [1, 2]
Nissim, Nir [1, 2]
Affiliations
[1] Ben Gurion Univ Negev, Cyber Secur Res Ctr, Malware Lab, IL-8470912 Beer Sheva, Israel
[2] Ben Gurion Univ Negev, Dept Ind Engn & Management, IL-8410501 Beer Sheva, Israel
Keywords
Computer-aided design; Auto list processing; Machine learning; Malware detection; Feature extraction; Classification
DOI
10.1016/j.engappai.2024.109414
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Computer-aided design (CAD) files are used to create digital designs for various structures, from the smallest chips in the high-tech industry to large-scale buildings and bridges in civil engineering. We found that most exploits and malicious payloads are deployed through Auto List Processing (AutoLISP) source code (LSP) or Fast Load AutoLISP (FAS) files, which are non-executable files (NEFs) containing scripts in the AutoLISP language that are native to AutoCAD. While antivirus software is capable of detecting many malicious CAD files, there remains potential to improve protection with a dedicated machine learning (ML) based detection solution, especially against unknown and sophisticated CAD malware. In this study, we are the first to propose designated feature extraction methods and a robust framework aimed at the detection of known and unknown AutoLISP malware using ML algorithms. To accomplish this, we examined the structure, functionality, and ecosystems of AutoLISP files and assembled the largest known representative collection of LSP files, consisting of 6418 labeled and verified malicious and benign files. We then explored the use of two novel static-analysis-based feature extraction methods (knowledge-based and structural) designated for LSP files to extract a discriminative set of informative features, which can subsequently be used by ML models to detect malicious LSP files. These two feature extraction methods serve as the basis of the proposed detection framework, whose performance we comprehensively compare to both widely used antiviruses and baseline ML models based on existing feature extraction methods, including MinHash, Bidirectional Encoder Representations from Transformers (BERT), and n-gram. Our results highlight our methods' contributions to the detection of unknown AutoLISP malware and demonstrate their ability to outperform existing methods.
The best performance in the task of unknown malicious LSP file detection was achieved by the artificial neural network (ANN) model trained on 100 knowledge-based features, which obtained a true positive rate (TPR) of 99.49% with a false positive rate (FPR) of 0.57%. To support explainability, we also present the prominent features that contribute most to the model's detection capabilities. We conclude by evaluating the proposed framework's ability to detect a malicious file from an unknown AutoLISP malware family and by evaluating our framework on an additional independent test set that originated from another source; both are scenarios that malware detection solutions often face.
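For readers less familiar with the two metrics reported in the abstract, the following is a minimal sketch of how TPR and FPR are computed from binary predictions. It is not the authors' evaluation code; the function name `tpr_fpr` and the toy labels are hypothetical and chosen only for illustration (1 = malicious, 0 = benign).

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR) for binary labels where 1 = malicious, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed
    # TPR = TP / (TP + FN): share of malicious files that were detected.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # FPR = FP / (FP + TN): share of benign files that were wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy labels invented for this sketch; they are not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
tpr, fpr = tpr_fpr(y_true, y_pred)
print(f"TPR={tpr:.2%}  FPR={fpr:.2%}")
```

A detector is useful in practice only when a high TPR comes with a low FPR, which is why the abstract reports the two figures together.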
Pages: 25