IRGen: Generative Modeling for Image Retrieval

Authors
Zhang, Yidan [1, 2]
Zhang, Ting [1]
Chen, Dong [3]
Wang, Yujing [3]
Chen, Qi [3]
Xie, Xing [3]
Sun, Hao [3]
Deng, Weiwei [3]
Zhang, Qi [3]
Yang, Fan [3]
Yang, Mao [3]
Liao, Qingmin [5]
Wang, Jingdong [4]
Guo, Baining [3]
Affiliations
[1] Beijing Normal Univ, Beijing, Peoples R China
[2] Univ Tokyo, Bunkyo City, Japan
[3] Microsoft Corp, Redmond, WA 98052 USA
[4] Baidu, Beijing, Peoples R China
[5] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen, Peoples R China
Keywords
Image Retrieval; Autoregressive Model; Generative Model; Product Quantization; Deep Quantization; Nearest-Neighbor; Network; End
DOI
10.1007/978-3-031-72633-0_2
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
While generative modeling has become prevalent across numerous research fields, its integration into the realm of image retrieval remains largely unexplored and underjustified. In this paper, we present a novel methodology, reframing image retrieval as a variant of generative modeling and employing a sequence-to-sequence model. This approach is harmoniously aligned with the current trend towards unification in research, presenting a cohesive framework that allows for end-to-end differentiable searching. This, in turn, facilitates superior performance via direct optimization techniques. The development of our model, dubbed IRGen, addresses the critical technical challenge of converting an image into a concise sequence of semantic units, which is pivotal for enabling efficient and effective search. Extensive experiments demonstrate that our model achieves state-of-the-art performance on three widely-used image retrieval benchmarks as well as two million-scale datasets, yielding significant improvement compared to prior competitive retrieval methods. In addition, the notable surge in precision scores facilitated by generative modeling presents the potential to bypass the reranking phase, which is traditionally indispensable in practical retrieval workflows. The code is publicly available at https://github.com/yakt00/IRGen.
Pages: 21-41
Page count: 21