Building an Effective Classifier for Phishing Web Pages Detection: A Quantum-Inspired Biomimetic Paradigm Suitable for Big Data Analytics of Cyber Attacks

Cited by: 6
Authors
Darwish, Saad M. [1 ]
Farhan, Dheyauldeen A. [2 ]
Elzoghabi, Adel A. [1 ]
Affiliations
[1] Alexandria Univ, Inst Grad Studies & Res, Dept Informat Technol, POB 832,163 Horreya Ave, Alexandria 21526, Egypt
[2] Al Maarif Univ Coll, Dept Comp Sci, Ramadi, Iraq
Keywords
malicious URLs detection; cyber security; big data analytics; biomimetic algorithm; quantum-inspired computing; feature selection
DOI
10.3390/biomimetics8020197
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Domain name service (DNS) data provide rich traces of Internet activity and are a powerful resource for combating malicious domains, which serve as a key platform for a wide range of attacks. This paper proposes a model for finding malicious domains by passively analyzing DNS data. The model builds a real-time, accurate, middleweight, and fast classifier by combining a genetic algorithm for selecting DNS data features with a two-step quantum artificial bee colony (QABC) algorithm for classification. The modified two-step QABC classifier uses K-means instead of random initialization to place food sources. To overcome ABC's poor exploitation ability and slow convergence, the paper uses the metaheuristic QABC algorithm, which is inspired by concepts from quantum physics and designed for global optimization problems. One of the main contributions is the use of the Hadoop framework and a hybrid machine learning approach (K-means and QABC) to handle the large volume of uniform resource locator (URL) data. The key point is that blacklists, heavyweight classifiers (those that use more features), and lightweight classifiers (those that use fewer features and consume the features from the browser) may all be improved by the suggested machine learning method. The results show that the suggested model achieves more than 96.6% accuracy on more than 10 million query-answer pairs.
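The abstract notes that food sources are placed with K-means rather than at random. A minimal sketch of that initialization idea follows, assuming feature vectors are numeric tuples. The function name `kmeans_food_sources`, the evenly spaced seeding, and the toy data are illustrative assumptions, not the authors' implementation, and the QABC search itself is not shown.

```python
def kmeans_food_sources(points, k, iters=20):
    """Return k K-means centroids of `points` to use as initial food sources.

    Hypothetical helper: it only illustrates seeding a bee-colony search
    with cluster centres instead of random positions.
    """
    # Assumption: seed centroids from evenly spaced data points so the
    # result is deterministic (real K-means usually seeds randomly).
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for j, cluster in enumerate(clusters):
            if cluster:
                dim = len(cluster[0])
                new_centroids.append(
                    tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dim))
                )
            else:
                new_centroids.append(centroids[j])  # keep centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments are stable, so the centroids have converged
        centroids = new_centroids
    return centroids


# Toy two-dimensional feature vectors forming two well-separated groups.
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
food_sources = kmeans_food_sources(features, k=2)
print(food_sources)
```

Starting the search from cluster centres places each initial candidate in a dense region of the data, which is one plausible reason for preferring it over random placement.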
Pages: 22