Classification and Prediction of Significant Cyber Incidents (SCI) Using Data Mining and Machine Learning (DM-ML)

Cited by: 3
Authors
Mumtaz, Gohar [1, 2]
Akram, Sheeraz [1, 2, 3]
Iqbal, Muhammad Waseem [4]
Ashraf, M. Usman [5]
Almarhabi, Khalid Ali [6]
Alghamdi, Ahmed Mohammed [7]
Bahaddad, Adel A. [8]
Affiliations
[1] Super Univ, Fac Comp Sci & Informat Technol, Lahore 54000, Pakistan
[2] Intelligent Data Visual Comp Res IDVCR, Lahore 73861, Pakistan
[3] Imam Mohammad Ibn Saud Islamic Univ IMSIU, Coll Comp & Informat Sci, Riyadh 11564, Saudi Arabia
[4] Super Univ, Fac Comp Sci & Informat Technol, Dept Software Engn, Lahore 54000, Pakistan
[5] GC Women Univ, Dept Comp Sci, Sialkot 51310, Pakistan
[6] Umm Al Qura Univ, Coll Comp Al Qunfudah, Dept Comp Sci, Mecca 21421, Saudi Arabia
[7] Univ Jeddah, Coll Comp Sci & Engn, Dept Software Engn, Jeddah 21493, Saudi Arabia
[8] King Abdulaziz Univ, Dept Informat Syst, Jeddah 21589, Saudi Arabia
Source
IEEE ACCESS | 2023 / Vol. 11
Keywords
Significant cyber incidents; cyber security; data mining; machine learning; SECURITY; ATTACKS;
DOI
10.1109/ACCESS.2023.3249663
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The rapid growth of technology and the growing number of IoT devices make cyberspace insecure and eventually lead to Significant Cyber Incidents (SCI). Cyber security is the practice of protecting internet-connected systems from SCI. Data Mining and Machine Learning (DM-ML) play an important role in cyber security in the prediction, prevention, and detection of SCI. This study highlights the importance of cyber security as well as the impact of COVID-19 on it. The dataset (SCI as reported by the Center for Strategic and International Studies (CSIS)) is divided into two subsets: pre-pandemic SCI and post-pandemic SCI. Data Mining (DM) techniques are used for feature extraction, and well-known ML classifiers such as Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF) are used for classification. A centralized classifier approach is used to maintain a single centralized dataset that takes inputs from six continents. The results on the pre-pandemic and post-pandemic datasets are compared, and the paper concludes with improved accuracy and a prediction of which type of SCI is likely to occur in which part of the world. SVM and RF are found to be much better classifiers than the others, and Asia is predicted to be the continent most affected by SCI.
Pages: 94486-94496
Page count: 11