IRGen: Generative Modeling for Image Retrieval

Authors
Zhang, Yidan [1, 2]
Zhang, Ting [1]
Chen, Dong [3]
Wang, Yujing [3]
Chen, Qi [3]
Xie, Xing [3]
Sun, Hao [3]
Deng, Weiwei [3]
Zhang, Qi [3]
Yang, Fan [3]
Yang, Mao [3]
Liao, Qingmin [5]
Wang, Jingdong [4]
Guo, Baining [3]
Affiliations
[1] Beijing Normal Univ, Beijing, Peoples R China
[2] Univ Tokyo, Bunkyo City, Japan
[3] Microsoft Corp, Redmond, WA 98052 USA
[4] Baidu, Beijing, Peoples R China
[5] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen, Peoples R China
Keywords
Image Retrieval; Autoregressive Model; Generative Model; PRODUCT QUANTIZATION; DEEP QUANTIZATION; NEAREST-NEIGHBOR; NETWORK; END;
DOI
10.1007/978-3-031-72633-0_2
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
While generative modeling has become prevalent across numerous research fields, its integration into the realm of image retrieval remains largely unexplored and underjustified. In this paper, we present a novel methodology, reframing image retrieval as a variant of generative modeling and employing a sequence-to-sequence model. This approach is harmoniously aligned with the current trend towards unification in research, presenting a cohesive framework that allows for end-to-end differentiable searching. This, in turn, facilitates superior performance via direct optimization techniques. The development of our model, dubbed IRGen, addresses the critical technical challenge of converting an image into a concise sequence of semantic units, which is pivotal for enabling efficient and effective search. Extensive experiments demonstrate that our model achieves state-of-the-art performance on three widely-used image retrieval benchmarks as well as two million-scale datasets, yielding significant improvement compared to prior competitive retrieval methods. In addition, the notable surge in precision scores facilitated by generative modeling presents the potential to bypass the reranking phase, which is traditionally indispensable in practical retrieval workflows. The code is publicly available at https://github.com/yakt00/IRGen.
Pages: 21-41
Page count: 21