ACP++: Action Co-Occurrence Priors for Human-Object Interaction Detection

Cited by: 11
Authors
Kim, Dong-Jin [1]
Sun, Xiao [2]
Choi, Jinsoo [1]
Lin, Stephen [2]
Kweon, In So [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Sch Elect & Comp Engn, Daejeon 34141, South Korea
[2] Microsoft Res Asia, Visual Comp Grp, Beijing 100080, Peoples R China
Keywords
Visualization; Training; Task analysis; Bicycles; Semantics; Context modeling; Benchmark testing; Human-object interaction; visual relationship; co-occurrence; label hierarchy; knowledge distillation; CLASSIFICATION;
DOI
10.1109/TIP.2021.3113563
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A common problem in the task of human-object interaction (HOI) detection is that numerous HOI classes have only a small number of labeled examples, resulting in training sets with a long-tailed distribution. The lack of positive labels can lead to low classification accuracy for these classes. Towards addressing this issue, we observe that there exist natural correlations and anti-correlations among human-object interactions. In this paper, we model the correlations as action co-occurrence matrices and present techniques to learn these priors and leverage them for more effective training, especially on rare classes. The efficacy of our approach is demonstrated experimentally, where the performance of our approach consistently improves over the state-of-the-art methods on both of the two leading HOI detection benchmark datasets, HICO-Det and V-COCO.
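The abstract describes modeling correlations among actions as co-occurrence matrices. As a rough illustration only, the sketch below estimates such a matrix from multi-hot action labels, assuming that entry (i, j) is the empirical conditional frequency of action j given action i. The function name, the label layout, and the toy data are hypothetical and are not taken from the paper.

```python
def cooccurrence_matrix(labels):
    """Estimate an action co-occurrence matrix from multi-hot labels.

    labels: list of rows, one per human-object pair, where row[k] is 1
            if action k is annotated for that pair and 0 otherwise.
    Returns a square matrix whose entry [i][j] is the fraction of pairs
    labeled with action i that are also labeled with action j.
    (Assumed definition for illustration, not the authors' code.)
    """
    num_actions = len(labels[0])
    # joint[i][j] counts pairs annotated with both action i and action j.
    joint = [[0] * num_actions for _ in range(num_actions)]
    for row in labels:
        active = [k for k, value in enumerate(row) if value]
        for i in active:
            for j in active:
                joint[i][j] += 1

    matrix = []
    for i in range(num_actions):
        count_i = joint[i][i]  # number of pairs annotated with action i
        # Actions that never occur get an all-zero row.
        matrix.append(
            [joint[i][j] / count_i if count_i else 0.0 for j in range(num_actions)]
        )
    return matrix


# Toy data: columns stand for three hypothetical actions.
toy_labels = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
]
C = cooccurrence_matrix(toy_labels)
```

In this toy data the matrix is asymmetric: every pair labeled with the second action also carries the first, but not the reverse. That asymmetry is the kind of signal a frequent action could lend to a rare one under the assumed definition.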
Pages: 9150-9163
Page count: 14