Combining Text and Visual Features to Improve the Identification of Cloned Webpages for Early Phishing Detection

Times cited: 15
Authors
van Dooremaal, Bram [1]
Burda, Pavlo [1]
Allodi, Luca [1]
Zannone, Nicola [1]
Affiliations
[1] Eindhoven Univ Technol, Eindhoven, Netherlands
Source
ARES 2021: 16TH INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY | 2021
Keywords
Phishing Detection; Target Identification; Visual Features; DISTANCE; IMAGES;
DOI
10.1145/3465481.3470112
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Phishing attacks arrive in high numbers and often spread quickly, so after-the-fact countermeasures such as domain blacklisting are of limited efficacy. Visual similarity-based approaches have the potential to detect previously unseen phishing webpages. These approaches, however, require identifying the legitimate webpage(s) that a phishing page reproduces. Existing approaches rely on textual feature analysis for target identification, with misclassification rates of approximately 1%; however, because most websites a user might visit are legitimate, additional research is needed to further reduce classification errors. In this work, we propose a novel method for target identification that relies on both visual features (extracted from a screenshot of the webpage) and textual features (extracted from the DOM of the webpage) to identify which website a phishing webpage is replicating, and we assess its effectiveness in detecting phishing websites using data from phishing aggregators such as OpenPhish, PhishTank, and PhishStats. Compared to state-of-the-art text-based classifiers, our method reduces the phishing misclassification rate by 67% (from 1.02% to 0.34%), for an accuracy of 99.66%. This work provides a further step toward semi-automated decision support systems for phishing detection.
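The headline figures in the abstract can be checked with a short calculation. The sketch below is illustrative only: the function name `relative_reduction` and the variable names are hypothetical, and it assumes that the reported accuracy is simply 100% minus the misclassification rate, which is consistent with the numbers quoted above.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate drops when going from `before` to `after`."""
    return (before - after) / before


# Misclassification rates (in percent) as quoted in the abstract.
text_only_error = 1.02   # state-of-the-art text-based classifiers
combined_error = 0.34    # text + visual features

reduction = relative_reduction(text_only_error, combined_error)

# Assumption: accuracy is the complement of the misclassification rate.
accuracy = 100.0 - combined_error

print(f"relative reduction: {reduction:.0%}")
print(f"accuracy: {accuracy:.2f}%")
```

Note that the 67% figure is a relative reduction in the error rate; the absolute change is 0.68 percentage points.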
Pages: 10