A Universal Malicious Documents Static Detection Framework Based on Feature Generalization

Cited by: 10
Authors
Lu, Xiaofeng [1]
Wang, Fei [1]
Jiang, Cheng [1]
Lio, Pietro [2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Cyberspace Secur, Beijing 100876, Peoples R China
[2] Univ Cambridge, Comp Lab, Cambridge CB3 0FD, England
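Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 24
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords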
malicious document detection; static detection; feature generalization; machine learning;
DOI
10.3390/app112412134
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
In this study, Portable Document Format (PDF), Word, Excel, Rich Text Format (RTF) and image documents are taken as the research objects to study a fast, static method for detecting malicious documents. Malicious PDF and Word document features are abstracted and extended so that they can be used to detect other types of documents. A universal static detection framework for malicious documents based on feature generalization is then proposed. The generalized features include specification check errors, the structure path, code keywords, and the number of objects. The proposed method is verified on two datasets, and is compared with Kaspersky, NOD32, and McAfee antivirus software. The experimental results demonstrate that the proposed method achieves good performance in terms of detection accuracy, runtime, and scalability. The average F1-score across all document types is found to be 0.99, and the average detection time per document is 0.5926 s, which is at the same level as the compared antivirus software.
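As a rough illustration of three of the generalized feature families named in the abstract (specification check errors, code keywords, and the number of objects), the sketch below computes them from raw bytes. It is not the authors' implementation. The keyword list, the PDF-only header/trailer check, the feature names, and the function extract_features are all assumptions made for this example, and the structure path feature is left out.

```python
import re

# Hypothetical keyword list: the abstract does not enumerate the actual keywords.
CODE_KEYWORDS = (b"/JavaScript", b"/OpenAction", b"/Launch", b"AutoOpen", b"Shell")

# Matches PDF-style indirect object headers such as "12 0 obj".
OBJECT_HEADER = re.compile(rb"\b\d+\s+\d+\s+obj\b")


def extract_features(data: bytes) -> dict:
    """Return a flat dict of static features for one document (illustrative only)."""
    features = {}

    # Code keywords: count raw occurrences of each keyword in the byte stream.
    for keyword in CODE_KEYWORDS:
        name = "kw_" + keyword.decode("ascii").lstrip("/")
        features[name] = data.count(keyword)

    # Number of objects: count indirect object headers.
    features["num_objects"] = len(OBJECT_HEADER.findall(data))

    # Specification check errors (assumed, PDF-only): missing header or EOF marker.
    missing_header = not data.startswith(b"%PDF-")
    missing_eof = b"%%EOF" not in data
    features["spec_errors"] = int(missing_header) + int(missing_eof)

    return features
```

A feature dict of this kind could be turned into a fixed-length vector and passed to any standard classifier; which classifier the paper uses is not stated in the abstract.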
Pages: 23