MLMG-SGG: Multilabel Scene Graph Generation With Multigrained Features

Cited by: 2

Authors:

Li, Xuewei [1]

Miao, Peihan [2]

Li, Songyuan [1]

Li, Xi [1,3,4]

Affiliations:

[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China

[2] Zhejiang Univ, Sch Software Technol, Ningbo 315048, Peoples R China

[3] Zhejiang Univ, Shanghai Inst Adv Study, Shanghai 201203, Peoples R China

[4] Shanghai AI Lab, Shanghai 200232, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2024 / Vol. 33

Keywords:

Pipelines; Feature extraction; Detectors; Image edge detection; Task analysis; Visualization; Object detection; Scene graph generation; multi-grained; multi-label classification; LANGUAGE; NETWORK;

DOI:

10.1109/TIP.2022.3199089

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

As an important and challenging problem in computer vision, scene graph generation (SGG) aims to find out the underlying semantic relationships among objects from a given image for scene understanding. Usually, prevalent SGG approaches adopt a learning pipeline with the assumption that there exists only a single relationship for a particular object pair. Considering the common phenomenon that a pair of objects can be attached by multiple relationships, we propose a multi-label scene graph generation pipeline with multi-grained features (MLMG-SGG), which formulates the relationship detection as a multi-label classification problem during training while generating multigraphs at inference time. In order to better model the fine-grained relationships, the proposed pipeline encodes the feature representation of SGG on different spatial scales by a specially designed Multi-Grained Module (MGM), resulting in the multi-grained (i.e., object-level and region-level) features of objects. Experimental results over the benchmark dataset demonstrate the significant performance gain of the proposed pipeline used as a plug-in for the state-of-the-art methods.


Pages: 1549-1559

Page count: 11

