Fast and robust duplicate image detection on the web

Cited by: 2
Authors
Gadeski, Etienne [1]
Le Borgne, Herve [1]
Popescu, Adrian [1]
Affiliation
[1] CEA, Vis & Content Engn Lab, LIST, Gif Sur Yvette, France
Keywords
Social media intelligence; Near duplicate detection; Copy detection; Visual web data; Image retrieval;
DOI
10.1007/s11042-016-3619-4
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Social media intelligence is concerned with detecting the massive propagation of similar visual content. Under certain conditions, this can be seen as a problem of detecting near-duplicate images in a stream of web data. In this context, however, an efficient indexing and search algorithm is not enough: the image description must also be fast to compute, since the total time for description and search must be short enough to satisfy the constraint imposed by the flow rate of the web stream. While most state-of-the-art methods focus on efficiency at search time, we propose a new descriptor that satisfies both requirements. We evaluate our method on two different datasets, using different sets of distractor images to build large-scale image collections (up to 100 million images). We compare our method to the state of the art and show that its detection performance is among the best and that it is much faster (by one to two orders of magnitude).
Pages: 11839-11858
Page count: 20