IRGen: Generative Modeling for Image Retrieval

Authors
Zhang, Yidan [1, 2]
Zhang, Ting [1]
Chen, Dong [3]
Wang, Yujing [3]
Chen, Qi [3]
Xie, Xing [3]
Sun, Hao [3]
Deng, Weiwei [3]
Zhang, Qi [3]
Yang, Fan [3]
Yang, Mao [3]
Liao, Qingmin [5]
Wang, Jingdong [4]
Guo, Baining [3]
Affiliations
[1] Beijing Normal Univ, Beijing, Peoples R China
[2] Univ Tokyo, Bunkyo City, Japan
[3] Microsoft Corp, Redmond, WA 98052 USA
[4] Baidu, Beijing, Peoples R China
[5] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen, Peoples R China
Keywords
Image Retrieval; Autoregressive Model; Generative Model; Product Quantization; Deep Quantization; Nearest-Neighbor; Network; End
DOI
10.1007/978-3-031-72633-0_2
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
While generative modeling has become prevalent across numerous research fields, its integration into the realm of image retrieval remains largely unexplored and underjustified. In this paper, we present a novel methodology, reframing image retrieval as a variant of generative modeling and employing a sequence-to-sequence model. This approach is harmoniously aligned with the current trend towards unification in research, presenting a cohesive framework that allows for end-to-end differentiable searching. This, in turn, facilitates superior performance via direct optimization techniques. The development of our model, dubbed IRGen, addresses the critical technical challenge of converting an image into a concise sequence of semantic units, which is pivotal for enabling efficient and effective search. Extensive experiments demonstrate that our model achieves state-of-the-art performance on three widely used image retrieval benchmarks as well as two million-scale datasets, yielding significant improvement compared to prior competitive retrieval methods. In addition, the notable surge in precision scores facilitated by generative modeling presents the potential to bypass the reranking phase, which is traditionally indispensable in practical retrieval workflows. The code is publicly available at https://github.com/yakt00/IRGen.
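
The abstract does not say how an image becomes a "concise sequence of semantic units". As an illustration only, the sketch below assumes a residual-quantization tokenizer, which is one common way to turn an embedding vector into a short list of discrete tokens and may differ from what IRGen actually does. The function names, the two-dimensional embeddings, and the `CODEBOOKS` values are all hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def residual_quantize(embedding: Vector,
                      codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Encode an embedding as one codeword index per codebook, coarse to fine.

    Assumption: this greedy residual scheme is a stand-in for the paper's
    tokenizer, which the abstract does not describe.
    """
    residual = list(embedding)
    tokens: List[int] = []
    for codebook in codebooks:
        # Pick the codeword closest to what is still unexplained.
        index = min(range(len(codebook)),
                    key=lambda i: squared_distance(residual, codebook[i]))
        tokens.append(index)
        # Subtract the chosen codeword so the next level refines the remainder.
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return tokens


# Hypothetical toy codebooks: level 1 is coarse, level 2 is a fine correction.
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)],
    [(0.0, 0.0), (0.25, 0.0), (0.0, 0.25)],
]

if __name__ == "__main__":
    # A made-up 2-D "image embedding" mapped to a two-token identifier.
    print(residual_quantize((1.2, 1.0), CODEBOOKS))
```

Under this assumption, each database image is identified by such a token sequence, and a sequence-to-sequence model could be trained to generate the identifiers of relevant images for a query, which is the kind of generative retrieval the abstract describes.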
Pages: 21-41
Page count: 21