Spam Email Categorization with NLP and Using Federated Deep Learning

Cited by: 1
Authors
Ul Haq, Ikram [1]
Black, Paul [1]
Gondal, Iqbal [1]
Kamruzzaman, Joarder [1]
Watters, Paul [2]
Kayes, A. S. M. [2]
Affiliations
[1] Federat Univ, ICSL, Sch Sci Engn & Informat Technol, Melbourne, Vic, Australia
[2] La Trobe Univ, Dept Comp Sci & Informat Technol, Ballarat, Vic, Australia
Source
ADVANCED DATA MINING AND APPLICATIONS, ADMA 2022, PT II | 2022 / Vol. 13726
Keywords
Spam detection; Phishing detection; Federated learning; Model averaging; Deep learning; Privacy-preserving; TF/IDF; Incremental learning;
DOI
10.1007/978-3-031-22137-8_2
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Email is the most popular and efficient communication method, which makes it vulnerable to misuse. Federated learning (FL) provides a decentralized machine learning (ML) model, where a central server coordinates clients that collaboratively train a shared ML model. This paper proposes a Federated Phishing Filtering (FPF) technique based on federated learning, natural language processing, and deep learning. FL for intelligent algorithms fuses trained models of ML algorithms from multiple sites for collective learning. This approach improves ML performance by utilizing large collective training data sets across the corporate client base, resulting in higher phishing email detection accuracy. The FPF technique preserves email privacy by using local feature extraction on client email servers. Thus, the contents of emails do not need to be transmitted across the network or stored on third-party servers. We have applied FL and Natural Language Processing (NLP) to email phishing detection. This technique provides four training modes that perform FL without sharing email content. Our research categorizes emails as benign, spam, or phishing. Empirical evaluations with publicly available datasets show that accuracy is improved by the use of our Federated Deep Learning model.
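The abstract describes fusing models trained at multiple sites, and the keywords mention model averaging, but this record does not state the exact aggregation rule. The sketch below is therefore only an illustration, assuming a sample-size-weighted average of client weight vectors in the style of federated averaging. The function name `federated_average` and the example numbers are hypothetical and not taken from the paper. Only model weights and example counts reach the server; email text stays on each client.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average per-client weight vectors, weighted by local training-set size.

    Assumption: every client trains the same model architecture, so all
    weight vectors have the same length.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one weight vector per client size")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of training examples must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share the same model shape")

    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        share = size / total  # this client's fraction of all training examples
        for i, value in enumerate(weights):
            averaged[i] += share * value
    return averaged


# Hypothetical example: three mail servers with different amounts of local data.
site_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
site_sizes = [100, 100, 200]
global_weights = federated_average(site_weights, site_sizes)
```

In this sketch the third site holds half of all training examples, so it contributes half of the averaged model. An unweighted mean would be an equally plausible reading of "model averaging".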
Pages: 15-27
Page count: 13