An Email Cyber Threat Intelligence Method Using Domain Ontology and Machine Learning

Cited by: 1

Authors:

Venckauskas, Algimantas [1]

Toldinas, Jevgenijus [1]

Morkevicius, Nerijus [1]

Sanfilippo, Filippo [2]

Affiliations:

[1] Kaunas Univ Technol, Dept Comp Sci, LT-44249 Kaunas, Lithuania

[2] Univ Agder UiA, Dept Engn Sci, N-4879 Grimstad, Norway

Source:

ELECTRONICS | 2024 / Vol. 13 / Issue 14

Keywords:

cyber threat intelligence; email; domain ontology; machine learning

DOI:

10.3390/electronics13142716

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Email is an excellent technique for connecting users at low cost. Spam emails pose the risk of collecting a user's personal information by fooling them into clicking on a link or engaging in other fraudulent activities. Furthermore, when a spam message is delivered, the user may read the entire message before deciding it is spam and deleting it. Most approaches to email classification proposed by other authors use natural language processing (NLP) methods to analyze the content of email messages. One of the biggest shortcomings of NLP-based methods is their dependence on the language in which a message is written. To construct an effective email cyber threat intelligence (CTI) sharing framework, the privacy of a message's content must be preserved. This article proposes a novel domain-specific ontology and method for emails that require only the metadata of email messages to be shared to preserve their privacy, making them applicable to solutions for sharing email CTI. To preserve privacy, a new semantic parser was developed for the proposed email domain-specific ontology to populate email metadata and create a dataset. Machine learning algorithms were examined, and experiments were conducted to identify and classify spam messages using the newly created dataset. Feature-ranking algorithms, chi-squared, ANOVA (analysis of variance), and Kruskal-Wallis tests were used. In all experiments, the kernel naïve Bayes model demonstrated acceptable results. The highest accuracy of 92.28% and an F1 score of 95.92% for recognizing spam email messages were obtained using the proposed domain-specific ontology, the newly developed semantic parser, and the created metadata dataset.
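The metadata-only idea in the abstract can be illustrated with a minimal sketch. This is not the authors' semantic parser or ontology: the function name `extract_metadata_features` and the chosen feature names are hypothetical, and the sketch only assumes that header-level fields are available. It uses Python's standard `email` package to derive a few features from headers without ever reading the message body.

```python
from email import message_from_string
from email.utils import getaddresses


def extract_metadata_features(raw_message: str) -> dict:
    """Derive illustrative features from headers only (hypothetical feature set).

    The message body is parsed by the library but never inspected here,
    so no message content ends up in the returned dictionary.
    """
    msg = message_from_string(raw_message)
    # Collect every address listed in the To and Cc headers.
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    return {
        # Number of Received headers, i.e. relay hops recorded in the message.
        "received_hops": len(msg.get_all("Received", [])),
        "recipient_count": len(recipients),
        "has_reply_to": msg.get("Reply-To") is not None,
        "content_type": msg.get_content_type(),
        "is_multipart": msg.is_multipart(),
    }


# A made-up message used purely for demonstration.
SAMPLE = (
    "Received: from a.example by b.example; Mon, 1 Jan 2024 00:00:00 +0000\n"
    "Received: from c.example by a.example; Mon, 1 Jan 2024 00:00:00 +0000\n"
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com, carol@example.com\n"
    "Reply-To: other@example.net\n"
    "Subject: Hello\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Body text that is never inspected.\n"
)

features = extract_metadata_features(SAMPLE)
print(features)
```

A feature dictionary of this kind could then be fed to a feature-ranking step and a classifier such as those named in the abstract; which features the paper actually uses is not stated in this record.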


Pages: 22
