Zero-Shot Human-Object Interaction Detection via Similarity Propagation

Cited by: 4
Authors
Zong, Daoming [1]
Sun, Shiliang [1, 2, 3]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200062, Peoples R China
[2] East China Normal Univ, Key Lab Adv Theory & Applicat Stat & Data Sci, Minist Educ, Shanghai 200062, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Human-object interaction (HOI) detection; object detection; zero-shot learning (ZSL)
DOI
10.1109/TNNLS.2023.3309104
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Human-object interaction (HOI) detection identifies interactions represented as <human, action, object> triplets, which requires localizing human-object pairs and classifying their interactions within an image. This work focuses on the challenge of detecting HOIs with unseen objects using the prevalent Transformer architecture. Our empirical analysis reveals that the performance degradation on novel HOI instances arises primarily from misclassifying unseen objects as confusable seen objects. To address this issue, we propose a similarity propagation (SP) scheme that leverages cosine similarity distance to regulate the prediction margin between seen and unseen objects. In addition, we introduce pseudo-supervision for unseen objects based on class semantic similarities during training. Furthermore, we incorporate semantic-aware instance-level and interaction-level contrastive losses into the Transformer to enhance intraclass compactness and interclass separability, resulting in improved visual representations. Extensive experiments on two challenging benchmarks, V-COCO and HICO-DET, demonstrate the effectiveness of our model, which outperforms current state-of-the-art methods under various zero-shot settings.
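The abstract does not give the formulation of the SP scheme or of the pseudo-supervision, so the following is only a minimal sketch of the general idea of deriving a pseudo-score for an unseen object class from class semantic similarities. Everything in it is an assumption made for illustration: the toy two-dimensional class embeddings, the function names, the softmax weighting, and the temperature value are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in values]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def propagate_to_unseen(seen_scores, seen_embeddings, unseen_embeddings,
                        temperature=0.1):
    """Hypothetical helper: score each unseen class as a similarity-weighted
    average of the seen-class scores (an assumed rule, not the paper's)."""
    seen_names = list(seen_scores)
    pseudo = {}
    for name, embedding in unseen_embeddings.items():
        sims = [cosine_similarity(embedding, seen_embeddings[s])
                for s in seen_names]
        weights = softmax(sims, temperature)
        pseudo[name] = sum(w * seen_scores[s]
                           for w, s in zip(weights, seen_names))
    return pseudo


# Toy data (assumed): "zebra" is unseen and semantically close to "horse".
seen_embeddings = {"horse": [1.0, 0.0], "cup": [0.0, 1.0]}
unseen_embeddings = {"zebra": [0.9, 0.1]}
seen_scores = {"horse": 0.8, "cup": 0.1}

pseudo = propagate_to_unseen(seen_scores, seen_embeddings, unseen_embeddings)
print(pseudo)
```

In this toy setup the unseen class inherits most of its score from the seen class it resembles most, which is the kind of signal that could serve as soft supervision when no labeled examples of the unseen class exist. A lower temperature concentrates the weight on the nearest seen class; a higher one spreads it more evenly.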
Pages: 17805-17816
Page count: 12