Effective near-duplicate image detection using perceptual hashing and deep learning

Cited by: 0
Authors
Jakhar, Yash [1]
Borah, Malaya Dutta [1]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Silchar, India
Keywords
Near-duplicate images; Neural network; Generative Adversarial Network; Perceptual hashing; Siamese network; Vision Transformer
DOI
10.1016/j.ipm.2025.104086
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Near-duplicate image detection is a long-standing problem in computer vision. Previous approaches highlighted the need to account adequately for image transformations in order to handle complex images effectively. We propose a method for finding near-duplicate images that integrates three techniques: perceptual hashing, a Siamese network, and a Vision Transformer. Perceptual hashing provides a quick way to filter out similar-looking pictures, while the Siamese network architecture paired with the Vision Transformer identifies more complex near-duplicate instances. The integrated approach learns a metric space from data that reflects both visual similarity and perceptual closeness among items in the dataset. The results demonstrate the effectiveness and robustness of the proposed method, which achieves an AUROC of 0.99 and a precision of 0.987 on the California-ND dataset, and an AUROC of 0.92 with a precision of 0.884 on the INRIA Holidays dataset, significantly outperforming traditional methods by over 10% in both metrics. This represents a significant step forward in near-duplicate image detection research.
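
As a rough illustration of the hashing stage only, the sketch below prefilters image pairs by Hamming distance between perceptual hashes. It rests on several assumptions that are not in the abstract: it uses a simple average hash (the abstract does not say which perceptual hash is used), the names `average_hash`, `hamming`, and `candidate_pairs` are hypothetical, and the `max_distance` threshold of 10 bits is arbitrary. The Siamese network and Vision Transformer stage, which would re-score the surviving pairs, is not shown.

```python
import numpy as np


def average_hash(img: np.ndarray, hash_size: int = 8) -> np.ndarray:
    """Return a flat boolean average hash for a 2-D grayscale image.

    Assumption: average hash stands in for whichever perceptual hash
    the paper actually uses.
    """
    h, w = img.shape
    # Crop so the image divides evenly into hash_size x hash_size blocks.
    bh, bw = h // hash_size, w // hash_size
    cropped = img[: bh * hash_size, : bw * hash_size].astype(np.float64)
    # Downsample by taking the mean of each block.
    small = cropped.reshape(hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    # One bit per block: is the block brighter than the overall mean?
    return (small > small.mean()).ravel()


def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(a != b))


def candidate_pairs(images, max_distance: int = 10):
    """Return index pairs whose hashes differ by at most max_distance bits.

    max_distance is a hypothetical threshold; pairs returned here would
    be passed on to a learned similarity model for the final decision.
    """
    hashes = [average_hash(im) for im in images]
    pairs = []
    for i in range(len(hashes)):
        for j in range(i + 1, len(hashes)):
            if hamming(hashes[i], hashes[j]) <= max_distance:
                pairs.append((i, j))
    return pairs
```

The pairwise loop is quadratic in the number of images, which is acceptable for a sketch; a larger collection would typically index the hashes instead of comparing every pair.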
Pages: 12