ACP++: Action Co-Occurrence Priors for Human-Object Interaction Detection

Cited by: 11

Authors:

Kim, Dong-Jin [1]

Sun, Xiao [2]

Choi, Jinsoo [1]

Lin, Stephen [2]

Kweon, In So [1]

Affiliations:

[1] Korea Adv Inst Sci & Technol, Sch Elect & Comp Engn, Daejeon 34141, South Korea

[2] Microsoft Res Asia, Visual Comp Grp, Beijing 100080, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2021 / Vol. 30

Keywords:

Visualization; Training; Task analysis; Bicycles; Semantics; Context modeling; Benchmark testing; Human-object interaction; visual relationship; co-occurrence; label hierarchy; knowledge distillation; CLASSIFICATION;

DOI:

10.1109/TIP.2021.3113563

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A common problem in the task of human-object interaction (HOI) detection is that numerous HOI classes have only a small number of labeled examples, resulting in training sets with a long-tailed distribution. The lack of positive labels can lead to low classification accuracy for these classes. Towards addressing this issue, we observe that there exist natural correlations and anti-correlations among human-object interactions. In this paper, we model the correlations as action co-occurrence matrices and present techniques to learn these priors and leverage them for more effective training, especially on rare classes. The efficacy of our approach is demonstrated experimentally, where the performance of our approach consistently improves over the state-of-the-art methods on both of the two leading HOI detection benchmark datasets, HICO-Det and V-COCO.
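
As a rough illustration of what an action co-occurrence matrix could look like, the sketch below estimates conditional co-occurrence frequencies from multi-label annotations. It is not the authors' implementation: the function name `cooccurrence_matrix`, the conditional-probability form C[i][j] ≈ P(action j | action i), and the toy labels are all assumptions made for this example.

```python
def cooccurrence_matrix(samples, num_actions):
    """Estimate C[i][j] ~ P(action j | action i) from multi-label samples.

    `samples` is a list of sets; each set holds the action indices
    annotated for one human-object pair (hypothetical input format).
    """
    pair_counts = [[0] * num_actions for _ in range(num_actions)]
    for actions in samples:
        for i in actions:
            for j in actions:
                pair_counts[i][j] += 1

    matrix = []
    for i in range(num_actions):
        total = pair_counts[i][i]  # number of samples that contain action i
        if total == 0:
            # Action never observed: leave its row at zero.
            matrix.append([0.0] * num_actions)
        else:
            matrix.append([pair_counts[i][j] / total for j in range(num_actions)])
    return matrix


# Hypothetical toy labels: 0 = "ride", 1 = "sit on", 2 = "wash"
samples = [{0, 1}, {0, 1}, {0}, {2}]
C = cooccurrence_matrix(samples, 3)
```

In this toy data, every "sit on" sample also carries "ride", so the row for "sit on" gives a strong hint about "ride", while "wash" never co-occurs with either. Under these assumptions, a matrix of this kind is asymmetric, which is one way a frequent action could inform predictions for a rare one.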


Pages: 9150-9163

Number of pages: 14

