CADefender: Detection of unknown malicious AutoLISP computer-aided design files using designated feature extraction and machine learning methods

Cited by: 0

Authors:

Yevsikov, Alexander [1,2]

Muralidharan, Trivikram [1,2]

Panker, Tomer [1,2]

Nissim, Nir [1,2]

Affiliations:

[1] Ben Gurion Univ Negev, Cyber Secur Res Ctr, Malware Lab, IL-8470912 Beer Sheva, Israel

[2] Ben Gurion Univ Negev, Dept Ind Engn & Management, IL-8410501 Beer Sheva, Israel

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2024 / Vol. 138

Keywords:

Computer-aided design; Auto list processing; Machine learning; Malware detection; Feature extraction; Classification

DOI:

10.1016/j.engappai.2024.109414

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Computer-aided design (CAD) files are used to create digital designs for various structures, from the smallest chips in the high-tech industry to large-scale buildings and bridges in the civil engineering space. We found that most exploits and malicious payloads are deployed through Auto List Processing (AutoLISP) source code (LSP) or Fast Load AutoLISP (FAS) files, which are non-executable files (NEFs) containing scripts in the AutoLISP language that are native to AutoCAD. While antivirus software is capable of detecting many malicious CAD files, there remains potential to improve protection with a dedicated machine learning (ML) based detection solution, especially against unknown and sophisticated CAD malware. In this study, we are the first to propose designated feature extraction methods and a robust framework aimed at the detection of known and unknown AutoLISP malware using ML algorithms. To accomplish this, we examined the structure, functionality, and ecosystems of AutoLISP files and collected the largest known representative collection of LSP files consisting of 6418 malicious and benign files (labeled and verified). We then explored the use of two novel static-analysis-based feature extraction methods (knowledge-based and structural) designated for LSP files to extract a discriminative set of informative features, which can subsequently be used by ML models to detect malicious LSP files. These two feature extraction methods serve as the basis of the proposed detection framework, whose performance we comprehensively compare to both widely used antiviruses and baseline ML models based on existing feature extraction methods, including MinHash, Bidirectional Encoder Representations from Transformers (BERT), and n-gram. Our results highlight our methods' contributions to the detection of unknown AutoLISP malware and demonstrate their ability to outperform existing methods.
The best performance in the task of unknown malicious LSP file detection was obtained by the Artificial Neural Networks (ANN) model trained on 100 knowledge-based features, which achieved a true positive rate (TPR) of 99.49% with a false positive rate (FPR) of 0.57%. We also highlight the framework's role in explainability by presenting the prominent features that contribute most to the model's detection capabilities. We conclude by evaluating the proposed framework's ability to detect a malicious file from an unknown AutoLISP malware family and by evaluating our framework on an additional independent test set that originated from another source, two scenarios that malware detection solutions often face.

Pages: 25
