Human–Object Interaction Detection: An Overview

Cited by: 1
Authors
Wang, Jia [1]
Shuai, Hong-Han [2]
Li, Yung-Hui [3]
Cheng, Wen-Huang [4]
Affiliations
[1] Guangdong Pharmaceut Univ, Guangzhou, Peoples R China
[2] Natl Yang Ming Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu, Taiwan
[3] Hon Hai Res Inst, Artificial Intelligence Res Ctr, Taipei, Taiwan
[4] Natl Taiwan Univ, Dept Comp Sci & Informat Engn, Taipei, Taiwan
Keywords
Feature extraction; Task analysis; Visualization; Cognition; Affordances; Convolutional neural networks; Consumer electronics
DOI
10.1109/MCE.2023.3343919
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This article systematically summarizes and discusses recent research on image-based human-object interaction (HOI) detection, which aims to detect human-object pairs and recognize the interactive behaviors between humans and objects in an image. HOI detection has many applications and can serve as a basis for higher-level visual understanding tasks. We introduce existing methods by categorizing them into two main groups based on the model structure: one-stage and two-stage approaches. We further divide one-stage methods into point-based, region-based, and query-based methods. Similarly, the two-stage methods are divided into HOI detection with multistream modeling, HOI detection with human parts and pose, HOI detection with compositional learning, HOI detection with graph-based modeling, and HOI detection with query-based modeling. According to this taxonomy, we also summarize and analyze the core ideas behind each strategy. Then, we present the details of the experimental protocols, evaluation metrics, datasets, and the evaluation results of the most recent representative methods. Finally, we discuss the main open challenges and future trends in the HOI detection task.
Pages: 56-72
Page count: 17