Contextual Heterogeneous Graph Network for Human-Object Interaction Detection

被引:43
作者
Hai Wang [1 ,2 ]
Zheng, Wei-shi [1 ,2 ]
Ling Yingbiao [1 ,3 ]
机构
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou, Peoples R China
[2] Minist Educ, Key Lab Machine Intelligence & Adv Comp, Guangzhou, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518005, Peoples R China
来源
COMPUTER VISION - ECCV 2020, PT XVII | 2020年 / 12362卷
关键词
Human-object interaction; Heterogeneous graph; Neural network;
D O I
10.1007/978-3-030-58520-4_15
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Human-object interaction (HOI) detection is an important task for understanding human activity. Graph structure is appropriate to denote the HOIs in the scene. Since there is an subordination between human and object-human play subjective role and object play objective role in HOI, the relations between homogeneous entities and heterogeneous entities in the scene should also not be equally the same. However, previous graph models regard human and object as the same kind of nodes and do not consider that the messages are not equally the same between different entities. In this work, we address such a problem for HOI task by proposing a heterogeneous graph network that models humans and objects as different kinds of nodes and incorporates intra-class messages between homogeneous nodes and inter-class messages between heterogeneous nodes. In addition, a graph attention mechanism based on the intra-class context and inter-class context is exploited to improve the learning. Extensive experiments on the benchmark datasets V-COCO and HICO-DET verify the effectiveness of our method and demonstrate the importance to extract intra-class and interclass messages which are not equally the same in HOI detection.
引用
收藏
页码:248 / 264
页数:17
相关论文
共 39 条
[1]   Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields [J].
Cao, Zhe ;
Simon, Tomas ;
Wei, Shih-En ;
Sheikh, Yaser .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :1302-1310
[2]   Learning to Detect Human-Object Interactions [J].
Chao, Yu-Wei ;
Liu, Yunfan ;
Liu, Xieyang ;
Zeng, Huayi ;
Deng, Jia .
2018 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2018), 2018, :381-389
[3]   HICO: A Benchmark for Recognizing Human-Object Interactions in Images [J].
Chao, Yu-Wei ;
Wang, Zhan ;
He, Yugeng ;
Wang, Jiaxuan ;
Deng, Jia .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :1017-1025
[4]   Multi-Context Attention for Human Pose Estimation [J].
Chu, Xiao ;
Yang, Wei ;
Ouyang, Wanli ;
Ma, Cheng ;
Yuille, Alan L. ;
Wang, Xiaogang .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :5669-5678
[5]  
Delaitre Vincent, 2011, NIPS
[6]  
Desai C, 2010, 2010 IEEE COMP VIS P, P9
[7]  
Feng W, 2019, Arxiv, DOI arXiv:1903.06355
[8]  
Gao C, 2018, arXiv
[9]  
Gilmer J, 2017, PR MACH LEARN RES, V70
[10]   Fast R-CNN [J].
Girshick, Ross .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :1440-1448