Mecha: A Neural-Symbolic Open-Set Homogeneous Decision Fusion Approach for Zero-Day Malware Similarity Detection

Cited by: 0

Authors:

Molloy, Christopher [1]

Banks, Jeremy [1]

Ding, Steven H. H. [1]

Alaca, Furkan [1]

Charland, Philippe [2]

Walenstein, Andrew [3]

Affiliations:

[1] Queens Univ, Sch Comp, Kingston, ON K7L2N8, Canada

[2] Def R&D Canada Valcartier, Mission Crit Cyber Secur Sect, Quebec City, PQ G0A 4Z0, Canada

[3] BlackBerry, Secur Res & Dev, Waterloo, ON N2K 0A7, Canada

Source:

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING | 2025 / Vol. 51 / Issue 02

Keywords:

Malware; Training; Codes; Accuracy; Neural networks; Visualization; Threat modeling; Research and development; Reinforcement learning; Optimization; Deep learning; reinforcement learning; cybersecurity; malware analysis; SMALL WORLD;

DOI:

10.1109/TSE.2025.3531210

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

With increasing numbers of novel malware each year, tools are required for efficient and accurate variant matching under the same family, for the purpose of effective proactive threat detection, retro-hunting, and attack campaign tracking. All of the state-of-the-art Deep Learning (DL) approaches assume that the incoming samples originate from known families and incorrectly identify novel families. Additionally, most of the existing solutions that leverage the Siamese Neural Network architecture either rely on pair-wise comparisons or computationally expensive preprocessing steps that are not scalable to a real-world malware triage volume requirement. We propose a different route, Mecha, a Neural-Symbolic Machine Learning (ML) system for malware variant matching and zero-day family detection. Mecha is comprised of an embedding network trained in two different scenarios for byte string embedding and an open-set approximate nearest neighbour algorithm for variant matching and zero-day detection. Our embedding network uses triplet loss for embedding generation and reinforcement-based Expectation Maximization (EM) learning for full deployment optimization. We conduct multiple in-sample and out-of-sample experiments to demonstrate the model's generalizability toward novel variants and families. We also show that Mecha can detect samples outside the known set of malware samples with an accuracy greater than 0.990.
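The two building blocks named in the abstract, a triplet loss over embeddings and an open-set nearest-neighbour lookup, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 2-D embeddings, the margin, and the distance threshold are all assumptions made for illustration, and a brute-force search stands in for the approximate nearest-neighbour index the abstract mentions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def open_set_match(query, index, threshold):
    """Return (label, distance) for the nearest indexed embedding.

    The label is None when the nearest neighbour is farther than
    `threshold`, i.e. the query is treated as an unknown family.
    Brute-force search is used here in place of an approximate index.
    """
    best_label, best_dist = None, math.inf
    for label, vec in index:
        d = euclidean(query, vec)
        if d < best_dist:
            best_label, best_dist = label, d
    if best_dist > threshold:
        return None, best_dist
    return best_label, best_dist


# Toy embeddings; labels and coordinates are made up for illustration.
index = [
    ("family_a", (0.0, 0.0)),
    ("family_a", (0.1, 0.0)),
    ("family_b", (5.0, 5.0)),
]
known_label, known_dist = open_set_match((0.0, 0.2), index, threshold=1.0)
novel_label, novel_dist = open_set_match((10.0, -10.0), index, threshold=1.0)
```

In this sketch a query close to an indexed sample inherits that sample's family label, while a query far from every indexed sample is reported as unknown. How the threshold is chosen in the actual system is not stated in the abstract.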

Pages: 621-637

Page count: 17

7 entries in total

[1] Adversarial Variational Modality Reconstruction and Regularization for Zero-Day Malware Variants Similarity Detection
Molloy, Christopher
Banks, Jeremy
Ding, Steven H. H.
Charland, Philippe
Walenstein, Andrew
Li, Litao
2022 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2022 : 1131 - 1136
[2] Zero-X: A Blockchain-Enabled Open-Set Federated Learning Framework for Zero-Day Attack Detection in IoV
Korba, Abdelaziz Amara
Boualouache, Abdelwahab
Ghamri-Doudane, Yacine
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (09) : 12399 - 12414
[3] A Reinforcement Learning-Based Approach for Detection Zero-Day Malware Attacks on IoT System
Ngo, Quoc-Dung
Nguyen, Quoc-Huu
ARTIFICIAL INTELLIGENCE TRENDS IN SYSTEMS, VOL 2, 2022, 502 : 381 - 394
[4] Zero-Day Malware Detection and Effective Malware Analysis Using Shapley Ensemble Boosting and Bagging Approach
Kumar, Rajesh
Subbiah, Geetha
SENSORS, 2022, 22 (07)
[5] Deep Neural Network and Transfer Learning for Accurate Hardware-Based Zero-Day Malware Detection
He, Zhangying
Rezaei, Amin
Homayoun, Houman
Sayadi, Hossein
PROCEEDINGS OF THE 32ND GREAT LAKES SYMPOSIUM ON VLSI 2022, GLSVLSI 2022, 2022 : 27 - 32
[6] Zero-Day Aware Decision Fusion-Based Model for Crypto-Ransomware Early Detection
Al-rimy, Bander Ali Saleh
Maarof, Mohd Aizaini
Prasetyo, Yuli Adam
Shaid, Syed Zainudeen Mohd
Ariffin, Aswami Fadillah Mohd
INTERNATIONAL JOURNAL OF INTEGRATED ENGINEERING, 2018, 10 (06) : 82 - 88
[7] Breakthrough to Adaptive and Cost-Aware Hardware-Assisted Zero-Day Malware Detection: A Reinforcement Learning-Based Approach
He, Zhangying
Makrani, Hosein Mohammadi
Rafatirad, Setareh
Homayoun, Houman
Sayadi, Hossein
2022 IEEE 40TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2022), 2022 : 231 - 238