Explainable Machine Learning for Bag of Words-Based Phishing Detection

Cited by: 1

Authors:

Calzarossa, Maria Carla [1]

Giudici, Paolo [2]

Zieni, Rasha [1]

Affiliations:

[1] Univ Pavia, Dept Elect Comp & Biomed Engn, Pavia, Italy

[2] Univ Pavia, Dept Econ & Management, Pavia, Italy

Source:

EXPLAINABLE ARTIFICIAL INTELLIGENCE, XAI 2023, PT I | 2023 / Vol. 1901

Keywords:

Explainable machine learning; Phishing detection; Lorenz Zonoid

DOI:

10.1007/978-3-031-44064-9_28

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Phishing is a fraudulent practice aimed at convincing individuals to reveal sensitive information, such as account credentials or credit card details, by clicking the links of malicious websites. To reduce the impacts of phishing, the timely identification of these websites is essential. For this purpose, machine learning models are often devised. In this paper, we address the problem of website phishing detection by proposing an explainable machine learning model based on bag of words features extracted from the content of the webpages. To select the most important features to be used in the model, we propose to employ the Lorenz Zonoid, the multidimensional generalization of the Gini coefficient. The resulting model achieves good accuracy and explains which words are most likely associated with phishing websites. In addition, the number of features retained is significantly reduced, making the model parsimonious and easier to interpret.

References

Pages: 531-543

Number of pages: 13

22 entries in total

[1] Blum A., 2010, P 3 ACM WORKSH ART I, P54
[2] Bracke P., 2019, Staff Working Paper, V816
[3] Explainable Machine Learning in Credit Risk Management
Bussmann, Niklas
Giudici, Paolo
Marinelli, Dimitri
Papenbrock, Jochen
[J]. COMPUTATIONAL ECONOMICS, 2021, 57 (01) : 203 - 216
[4] Explainable machine learning for phishing feature detection
Calzarossa, Maria Carla
Giudici, Paolo
Zieni, Rasha
[J]. QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 2024, 40 (01) : 362 - 373
[5] DeltaPhish: Detecting Phishing Webpages in Compromised Websites
Corona, Igino
Biggio, Battista
Contini, Matteo
Piras, Luca
Corda, Roberto
Mereu, Mauro
Mureddu, Guido
Ariu, Davide
Roli, Fabio
[J]. COMPUTER SECURITY - ESORICS 2017, PT I, 2018, 10492 : 370 - 388
[6] Shapley-Lorenz eXplainable Artificial Intelligence
Giudici, Paolo
Raffinetti, Emanuela
[J]. EXPERT SYSTEMS WITH APPLICATIONS, 2021, 167
[7] Lorenz Model Selection
Giudici, Paolo
Raffinetti, Emanuela
[J]. JOURNAL OF CLASSIFICATION, 2020, 37 (03) : 754 - 768
[8] Phishing Detection Using URL-based XAI Techniques
Hernandes Jr, Paulo R. Galego
Floret, Camila P.
de Almeida, Katia F. Cardozo
da Silva, Vinicius Camargo
Papa, João Paulo
da Costa, Kelton A. Pontara
[J]. 2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021), 2021,
[9] A machine learning based approach for phishing detection using hyperlinks information
Jain, Ankit Kumar
Gupta, B. B.
[J]. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2019, 10 (05) : 2015 - 2028
[10] The Lorenz zonoid of a multivariate distribution
Koshevoy, G
Mosler, K
[J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1996, 91 (434) : 873 - 882
