Detecting Visually Similar Web Pages: Application to Phishing Detection

Cited by: 59
Authors
Chen, Teh-Chung [1]
Dick, Scott [1]
Miller, James [1]
Affiliations
[1] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2M7, Canada
Keywords
Security; Human Factors; Algorithmic complexity theory; Gestalt theory; Web page similarity; anti-phishing technologies; IMAGE; CLASSIFICATION; DISTANCE
DOI
10.1145/1754393.1754394
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We propose a novel approach for detecting visual similarity between two Web pages. The proposed approach applies Gestalt theory and considers a Web page as a single indivisible entity. The concept of supersignals, as a realization of Gestalt principles, supports our contention that Web pages must be treated as indivisible entities. We objectify, and directly compare, these indivisible supersignals using algorithmic complexity theory. We illustrate our approach by applying it to the problem of detecting phishing scams. Via a large-scale, real-world case study, we demonstrate that (1) our approach effectively detects similar Web pages and (2) it accurately distinguishes legitimate and phishing pages.
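The abstract does not name the exact measure used to compare pages. A common computable approximation of algorithmic-complexity-based similarity is the normalized compression distance (NCD), so the sketch below assumes that measure for illustration only. The choice of zlib as the compressor, the function names `compressed_size` and `ncd`, and the treatment of each page as one raw byte string (for example its rendered screenshot or its HTML) are all assumptions, not details taken from the paper.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed input, a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # approximate joint description length of both inputs
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Under these assumptions, a suspect page whose distance to a known legitimate page falls below some chosen threshold would be flagged as a visual look-alike. Because the DEFLATE format used by zlib has a 32 KiB window, this particular stand-in is only meaningful for small inputs; larger pages would need a compressor with a longer window.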
Pages: 38