Structured Neural Motifs: Scene Graph Parsing via Enhanced Context

Cited by: 3

Authors:

Li, Yiming ^{[1,4]}

Yang, Xiaoshan ^{[2,3,4]}

Xu, Changsheng ^{[1,2,3,4]}

Affiliations:

[1] HeFei Univ Technol, Hefei, Peoples R China

[2] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China

[3] Univ Chinese Acad Sci, Beijing, Peoples R China

[4] Peng Cheng Lab, Shenzhen, Peoples R China

Source:

MULTIMEDIA MODELING (MMM 2020), PT II | 2020 / Vol. 11962

Funding:

National Natural Science Foundation of China;

Keywords:

Scene graph; Deep learning; LSTMs;

DOI:

10.1007/978-3-030-37734-2_15

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A scene graph is a structured representation of the visual content in an image. It is helpful for complex visual understanding tasks such as image captioning, visual question answering, and semantic image retrieval. Because real-world images usually contain multiple object instances and complex relationships, context information is extremely important for scene graph generation. It has been noted that the context dependencies among different nodes in the scene graph are asymmetric, which means that relationship labels can very likely be predicted directly from object labels, but not vice versa. Based on this finding, the existing motifs network has successfully exploited the context patterns among object nodes and the dependencies between object nodes and relation nodes. However, spatial information and the context dependencies among relation nodes are neglected. In this work, we propose the Structured Motif Network (StrcMN), which predicts object labels and pairwise relationships by mining more complete global context features. Experiments show that our model significantly outperforms previous methods on the VRD and Visual Genome datasets.

Pages: 175-188

Number of pages: 14
