Learning Human-Object Interactions by Graph Parsing Neural Networks

Cited by: 415
Authors
Qi, Siyuan [1, 2]
Wang, Wenguan [1, 3]
Jia, Baoxiong [1, 4]
Shen, Jianbing [3, 5]
Zhu, Song-Chun [1, 2]
Affiliations
[1] Univ Calif Los Angeles, Los Angeles, CA USA
[2] Int Ctr AI & Robot Auton CARA, Los Angeles, CA USA
[3] Beijing Inst Technol, Beijing, Peoples R China
[4] Peking Univ, Beijing, Peoples R China
[5] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
Source
COMPUTER VISION - ECCV 2018, PT IX | 2018 / Vol. 11213
Keywords
Human-object interaction; Message passing; Graph parsing; Neural networks; AFFORDANCES;
DOI
10.1007/978-3-030-01240-3_25
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses the task of detecting and recognizing human-object interactions (HOI) in images and videos. We introduce the Graph Parsing Neural Network (GPNN), a framework that incorporates structural knowledge while being differentiable end-to-end. For a given scene, GPNN infers a parse graph that includes (i) the HOI graph structure represented by an adjacency matrix, and (ii) the node labels. Within a message passing inference framework, GPNN iteratively computes the adjacency matrices and node labels. We extensively evaluate our model on three HOI detection benchmarks on images and videos: HICO-DET, V-COCO, and CAD-120 datasets. Our approach significantly outperforms state-of-the-art methods, verifying that GPNN is scalable to large datasets and applies to spatial-temporal settings.
Pages: 407-423
Page count: 17