Structured Query-Based Image Retrieval Using Scene Graphs

Cited by: 32

Authors:

Schroeder, Brigit [1]

Tripathi, Subarna [2]

Affiliations:

[1] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA

[2] Intel Labs, Santa Clara, CA USA

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020) | 2020

DOI:

10.1109/CVPRW50498.2020.00097

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

A structured query can capture the complexity of object interactions (e.g., 'woman rides motorcycle'), unlike single-object queries (e.g., 'woman' or 'motorcycle'). Retrieval using structured queries is therefore much more useful than single-object retrieval, but it is also a much more challenging problem. In this paper we present an image retrieval method based on scene graph embeddings. We examine how visual relationships, derived from scene graphs, can be used as structured queries. Each visual relationship is a directed subgraph of the scene graph, with a subject and an object as nodes connected by a predicate relationship. Notably, we achieve high recall even on low- to medium-frequency objects in the long-tailed COCO-Stuff dataset, and find that adding a visual relationship-inspired loss boosts our recall by 10% in the best case.
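The idea of a visual relationship as a directed subject-predicate-object subgraph, and of measuring retrieval quality by recall, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Relationship` class, the helper functions, the image ids, and the 3-dimensional vectors are all hypothetical stand-ins, and plain cosine similarity is assumed in place of whatever learned scene graph embedding and ranking the paper actually uses.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Relationship:
    """A directed subject -> object edge labeled with a predicate (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


def relationships(nodes, edges):
    """Expand a scene graph into its visual relationships.

    nodes: list of object labels.
    edges: list of (subject_index, predicate, object_index) tuples.
    """
    return [Relationship(nodes[s], p, nodes[o]) for s, p, o in edges]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, index, k):
    """Return the ids of the k images most similar to the query embedding."""
    ranked = sorted(index, key=lambda i: cosine(query_vec, index[i]), reverse=True)
    return ranked[:k]


def recall_at_k(queries, index, k):
    """Fraction of (query_vec, target_id) pairs whose target is in the top k."""
    hits = sum(1 for vec, target in queries if target in retrieve(vec, index, k))
    return hits / len(queries)


# Toy scene graph for the abstract's example query.
nodes = ["woman", "motorcycle"]
edges = [(0, "rides", 1)]
rels = relationships(nodes, edges)

# Assumed toy embeddings; a real system would learn these from scene graphs.
index = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [0.1, 0.0, 0.9],
}
queries = [([1.0, 0.0, 0.0], "img_a"), ([0.0, 0.2, 1.0], "img_c")]
top1_recall = recall_at_k(queries, index, k=1)
```

Here `rels` holds the single relationship ('woman', 'rides', 'motorcycle'), and `top1_recall` is the share of toy queries whose target image ranks first. Keeping the relationship as an explicit triple, rather than an unordered pair of labels, preserves the direction of the predicate.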

Citation

Pages: 680-684

Page count: 5

