MLMG-SGG: Multilabel Scene Graph Generation With Multigrained Features

Cited by: 2
Authors
Li, Xuewei [1]
Miao, Peihan [2]
Li, Songyuan [1]
Li, Xi [1, 3, 4]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, Sch Software Technol, Ningbo 315048, Peoples R China
[3] Zhejiang Univ, Shanghai Inst Adv Study, Shanghai 201203, Peoples R China
[4] Shanghai AI Lab, Shanghai 200232, Peoples R China
Keywords
Pipelines; Feature extraction; Detectors; Image edge detection; Task analysis; Visualization; Object detection; Scene graph generation; multi-grained; multi-label classification; LANGUAGE; NETWORK;
DOI
10.1109/TIP.2022.3199089
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As an important and challenging problem in computer vision, scene graph generation (SGG) aims to find out the underlying semantic relationships among objects from a given image for scene understanding. Usually, prevalent SGG approaches adopt a learning pipeline with the assumption that there exists only a single relationship for a particular object pair. Considering the common phenomenon that a pair of objects can be attached by multiple relationships, we propose a multi-label scene graph generation pipeline with multi-grained features (MLMG-SGG), which formulates the relationship detection as a multi-label classification problem during training while generating multigraphs at inference time. In order to better model the fine-grained relationships, the proposed pipeline encodes the feature representation of SGG on different spatial scales by a specially designed Multi-Grained Module (MGM), resulting in the multi-grained (i.e., object-level and region-level) features of objects. Experimental results over the benchmark dataset demonstrate the significant performance gain of the proposed pipeline used as a plug-in for the state-of-the-art methods.
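The multi-label formulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the predicate vocabulary, the logit values, the 0.5 threshold, and the function names below are all hypothetical, and the abstract does not state how the multigraph is decoded at inference time. The sketch only shows the general idea of scoring each predicate independently with a sigmoid, training with binary cross-entropy, and keeping every predicate above a threshold so that one object pair can carry several edges.

```python
import math

# Hypothetical predicate vocabulary (an assumption, not taken from the paper).
PREDICATES = ["on", "sitting on", "riding", "near"]


def sigmoid(x):
    """Map a raw score to an independent per-predicate probability."""
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_bce(logits, targets):
    """Mean binary cross-entropy over predicates.

    Each predicate is treated as its own yes/no decision, so several
    targets may be 1 for the same object pair.
    """
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(logits)


def predict_edges(subject, obj, logits, threshold=0.5):
    """Return one (subject, predicate, object) edge per predicate whose
    probability reaches the (assumed) threshold, giving multigraph edges."""
    return [
        (subject, PREDICATES[i], obj)
        for i, z in enumerate(logits)
        if sigmoid(z) >= threshold
    ]


# Illustrative scores for a single object pair (made-up numbers).
logits = [2.0, 1.5, -3.0, -0.5]
targets = [1, 1, 0, 0]  # both "on" and "sitting on" are annotated as true

loss = multilabel_bce(logits, targets)
edges = predict_edges("person", "bench", logits)
```

By contrast, a single-label pipeline would apply a softmax over the predicates and keep only the top-scoring one, which is the assumption the abstract argues against.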
Pages: 1549-1559
Page count: 11