Structured Query-Based Image Retrieval Using Scene Graphs

Cited by: 32
Authors
Schroeder, Brigit [1]
Tripathi, Subarna [2]
Affiliations
[1] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA
[2] Intel Labs, Santa Clara, CA USA
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020) | 2020
Keywords
DOI
10.1109/CVPRW50498.2020.00097
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A structured query can capture the complexity of object interactions (e.g., 'woman rides motorcycle'), unlike single objects (e.g., 'woman' or 'motorcycle'). Retrieval using structured queries is therefore much more useful than single-object retrieval, but also a much more challenging problem. In this paper we present a method that uses scene graph embeddings as the basis for an approach to image retrieval. We examine how visual relationships, derived from scene graphs, can be used as structured queries. The visual relationships are directed subgraphs of the scene graph, with a subject and an object as nodes connected by a predicate relationship. Notably, we are able to achieve high recall even on low- to medium-frequency objects found in the long-tailed COCO-Stuff dataset, and find that adding a visual relationship-inspired loss boosts our recall by 10% in the best case.
Pages: 680-684
Page count: 5