Graph R-CNN for Scene Graph Generation

Cited by: 585
Authors
Yang, Jianwei [1]
Lu, Jiasen [1]
Lee, Stefan [1]
Batra, Dhruv [1, 2]
Parikh, Devi [1, 2]
Affiliations
[1] Georgia Inst Technol, Atlanta, GA 30332 USA
[2] Facebook AI Res, Menlo Pk, CA USA
Source
COMPUTER VISION - ECCV 2018, PT I | 2018 / Vol. 11205
Keywords
Graph R-CNN; Scene graph generation; Relation proposal network; Attentional graph convolutional network
DOI
10.1007/978-3-030-01246-5_41
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a novel scene graph generation model called Graph R-CNN, that is both effective and efficient at detecting objects and their relations in images. Our model contains a Relation Proposal Network (RePN) that efficiently deals with the quadratic number of potential relations between objects in an image. We also propose an attentional Graph Convolutional Network (aGCN) that effectively captures contextual information between objects and relations. Finally, we introduce a new evaluation metric that is more holistic and realistic than existing metrics. We report state-of-the-art performance on scene graph generation as evaluated using both existing and our proposed metrics.
Pages: 690-706
Page count: 17