Pairwise Body-Part Attention for Recognizing Human-Object Interactions

Cited by: 91
Authors
Fang, Hao-Shu [1]
Cao, Jinkun [1]
Tai, Yu-Wing [2]
Lu, Cewu [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Tencent YouTu Lab, Shanghai, Peoples R China
Source
COMPUTER VISION - ECCV 2018, PT X | 2018 / Vol. 11214
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Human-object interactions; Body-part correlations; Attention model; ACTION RECOGNITION;
DOI
10.1007/978-3-030-01249-6_4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In human-object interaction (HOI) recognition, conventional methods consider the human body as a whole and pay uniform attention to the entire body region. They ignore the fact that humans normally interact with an object using only some parts of the body. In this paper, we argue that different body parts should receive different amounts of attention in HOI recognition, and that the correlations between body parts should also be considered, because our body parts always work collaboratively. We propose a new pairwise body-part attention model that can learn to focus on crucial parts and their correlations for HOI recognition. The model introduces a novel attention-based feature selection method and a feature representation scheme that captures pairwise correlations between body parts. Our proposed approach achieved a 10% relative improvement (36.1 mAP -> 39.9 mAP) over the state-of-the-art results in HOI recognition on the HICO dataset. We will make our model and source code publicly available.
Pages: 52-68
Number of pages: 17