Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style

Cited by: 17
Authors
Lin, Fengyin [1]
Li, Mingkang [1]
Li, Da [2]
Hospedales, Timothy [2, 3]
Song, Yi-Zhe [4]
Qi, Yonggang [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[2] Samsung AI Ctr, Cambridge, England
[3] Univ Edinburgh, Edinburgh, Midlothian, Scotland
[4] Univ Surrey, SketchX, CVSSP, Guildford, Surrey, England
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
DOI
10.1109/CVPR52729.2023.02236
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR), however with two significant differentiators to prior art: (i) we tackle all variants (inter-category, intra-category, and cross-dataset) of ZS-SBIR with just one network ("everything"), and (ii) we would really like to understand how this sketch-photo matching operates ("explainable"). Our key innovation lies in the realization that such a cross-modal matching problem could be reduced to comparisons of groups of key local patches - akin to the seasoned "bag-of-words" paradigm. Just with this change, we are able to achieve both of the aforementioned goals, with the added benefit of no longer requiring external semantic knowledge. Technically, ours is a transformer-based cross-modal network with three novel components: (i) a self-attention module with a learnable tokenizer to produce visual tokens that correspond to the most informative local regions, (ii) a cross-attention module to compute local correspondences between the visual tokens across the two modalities, and finally (iii) a kernel-based relation network to assemble local putative matches and produce an overall similarity metric for a sketch-photo pair. Experiments show ours indeed delivers superior performance across all ZS-SBIR settings. The all-important explainable goal is elegantly achieved by visualizing cross-modal token correspondences and, for the first time, via sketch-to-photo synthesis by universal replacement of all matched photo patches. Code and model are available at https://github.com/buptLinfy/ZSE-SBIR.
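The matching idea in the abstract (cross-attention between two sets of local tokens, followed by a kernel that aggregates local matches into one score) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the RBF kernel, the mean pooling, and the random token features are all assumptions chosen for illustration, standing in for the learned tokenizer and the kernel-based relation network described above.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys):
    # Scaled dot-product attention from one token set to another.
    # Returns the attended keys and the attention weights, which can be
    # read as soft local correspondences between the two sets.
    d = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    return weights @ keys, weights

def pair_similarity(sketch_tokens, photo_tokens, gamma=1.0):
    # Hypothetical aggregation: an RBF kernel between each sketch token
    # and its attended photo representation, averaged over tokens.
    # This is an illustrative stand-in, not the paper's relation network.
    aligned, weights = cross_attention(sketch_tokens, photo_tokens)
    sq_dist = ((sketch_tokens - aligned) ** 2).sum(axis=-1)
    local_scores = np.exp(-gamma * sq_dist)
    return float(local_scores.mean()), weights

# Random features stand in for learned visual tokens (assumption).
rng = np.random.default_rng(0)
sketch = rng.normal(size=(4, 8))   # 4 sketch tokens, 8-dim each
photo = rng.normal(size=(6, 8))    # 6 photo tokens, 8-dim each

score, weights = pair_similarity(sketch, photo)
print("similarity:", score)
print("correspondence matrix shape:", weights.shape)
```

Each row of `weights` is a distribution over photo tokens for one sketch token, which is the kind of quantity one could visualize to inspect which local regions were matched.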
Pages: 23349-23358
Number of pages: 10