SceneSketcher-v2: Fine-Grained Scene-Level Sketch-Based Image Retrieval Using Adaptive GCNs

Cited by: 10

Authors:

Liu, Fang [1,2]

Deng, Xiaoming [3,4,5]

Zou, Changqing

Lai, Yu-Kun [8]

Chen, Keqi [3,4,5]

Zuo, Ran [3,4,5,6,7]

Ma, Cuixia [3,4,5]

Liu, Yong-Jin [9]

Wang, Hongan [3,4,5]

Affiliations:

[1] Univ Chinese Acad Sci, Chinese Acad Sci, Inst Software, Beijing 100190, Peoples R China

[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China

[3] Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing 100190, Peoples R China

[4] Chinese Acad Sci, Inst Software, Beijing Key Lab Human Comp Interact, Beijing 100190, Peoples R China

[5] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing Key Lab Human Comp Interact, Beijing 100190, Peoples R China

[6] Zhejiang Univ, State Key Lab CAD&CG, Hangzhou 310027, Peoples R China

[7] Zhejiang Lab, Res Ctr Artificial Intelligence & Fine Arts, Hangzhou 310058, Peoples R China

[8] Cardiff Univ, Sch Comp Sci & Informat, Cardiff CF24 4AG, Wales

[9] Tsinghua Univ, Dept Comp Sci & Technol, MOE Key Lab Pervas Comp, BNRist, Beijing 100084, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2022 / Vol. 31

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China;

Keywords:

Image retrieval; Semantics; Visualization; Layout; Task analysis; Electronic mail; Adaptation models; Sketch-based image retrieval; graph convolutional network; scene sketch; fine-grained image retrieval; DESCRIPTOR; ALIGNMENT;

DOI:

10.1109/TIP.2022.3175403

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Sketch-based image retrieval (SBIR) is a long-standing research topic in computer vision. Existing methods mainly focus on category-level or instance-level image retrieval. This paper investigates the fine-grained scene-level SBIR problem, in which a free-hand sketch depicting a scene is used to retrieve the desired images. This problem is useful yet challenging, mainly because of two entangled factors: 1) effectively representing the query sketch and scene-level images is difficult, as it requires modeling information across multiple modalities such as object layout, relative size, and visual appearance, and 2) there is a large domain gap between the query sketch and the target images. We present SceneSketcher-v2, a Graph Convolutional Network (GCN) based architecture that addresses these challenges. SceneSketcher-v2 employs a carefully designed graph convolutional network to fuse the multi-modality information in the query sketch and target images, and uses triplet training in an end-to-end manner to alleviate the domain gap. Extensive experiments demonstrate that SceneSketcher-v2 outperforms state-of-the-art scene-level SBIR models by a significant margin.


Pages: 3737-3751

Page count: 15

