Application of word embedding and machine learning in detecting phishing websites

Cited by: 0
Authors
Routhu Srinivasa Rao
Amey Umarekar
Alwyn Roshan Pais
Affiliations
[1] GMR Institute of Technology,Department of Computer Science and Engineering
[2] National Institute of Technology,Information Security Research Lab, Department of Computer Science and Engineering
Source
Telecommunication Systems | 2022 / Vol. 79
Keywords
URL; Phishing; Anti-phishing; TF-IDF; Hostname; Random forest;
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Phishing is an attack that aims to obtain personal information, such as passwords and credit card details, from online users by deceiving them through fake websites, emails, or any legitimate internet service. Many techniques exist to detect phishing sites, such as third-party-based techniques, source-code-based methods, and URL-based methods, yet users are still tricked into revealing their sensitive information. In this paper, we propose a new technique that detects phishing sites with word embeddings using plain text and domain-specific text extracted from the source code. We applied various word embeddings to evaluate our model using ensemble and multimodal approaches. In the experimental evaluation, the multimodal approach with domain-specific text achieved an accuracy of 99.34%, with a TPR of 99.59%, an FPR of 0.93%, and an MCC of 98.68%.
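The abstract reports accuracy, TPR, FPR, and MCC. As a reading aid, the sketch below shows how these four measures are conventionally computed from a binary confusion matrix. The function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the paper; the abstract states MCC as a percentage, whereas the function returns it on the usual -1 to 1 scale.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, TPR, FPR and MCC from confusion-matrix counts.

    tp: phishing pages flagged as phishing
    fp: legitimate pages flagged as phishing
    tn: legitimate pages passed as legitimate
    fn: phishing pages passed as legitimate
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # True positive rate: share of phishing pages that were caught.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # False positive rate: share of legitimate pages wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "tpr": tpr, "fpr": fpr, "mcc": mcc}


# Hypothetical counts for illustration only (not results from the paper).
example = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

Multiplying each value by 100 gives the percentage form used in the abstract.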
Pages: 33-45
Number of pages: 12