Weakly Supervised Referring Expression Grounding via Dynamic Self-Knowledge Distillation

Cited by: 1
Authors
Mi, Jinpeng [1 ]
Chen, Zhiqian [1 ]
Zhang, Jianwei [2 ]
Affiliations
[1] Univ Shanghai Sci & Technol, Inst Machine Intelligence IMI, Shanghai, Peoples R China
[2] Univ Hamburg, Dept Informat, Tech Aspects Multimodal Syst TAMS, Hamburg, Germany
Source
2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IROS | 2023
Funding
US National Science Foundation;
Keywords
RECONSTRUCTION;
DOI
10.1109/IROS55552.2023.10341909
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Weakly supervised referring expression grounding (WREG) is an attractive yet challenging task that aims to ground target regions in images by understanding given referring expressions. WREG learns to ground target objects without manual annotations linking image regions and referring expressions during model training. Unlike the predominant grounding pattern of existing models, which locates target objects by reconstructing the region-expression correspondence, we investigate WREG from a novel perspective and enrich the prevailing pattern with self-knowledge distillation. Specifically, we propose a target-guided self-knowledge distillation approach that adopts the target prediction knowledge learned in previous training iterations as a teacher to guide the subsequent training procedure. To avoid misleading guidance from teacher knowledge with low prediction confidence, we present an uncertainty-aware knowledge refinement strategy that adaptively rectifies the teacher knowledge by learning dynamic threshold values based on the model's prediction uncertainty. To validate the proposed approach, we conduct extensive experiments on three benchmark datasets, i.e., RefCOCO, RefCOCO+, and RefCOCOg. Our approach achieves new state-of-the-art results on several splits of these benchmarks, demonstrating the advantage of the proposed framework for WREG. The implementation code and trained models are available at: https://github.com/dami23/WREG_Self_KD
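Since the abstract describes the two training mechanisms only in prose, the following is a minimal PyTorch sketch of how they could fit together. It is not the authors' released implementation: the class name SelfKDLoss, the temperature value, and the EMA-based threshold update are all illustrative assumptions (the paper itself learns dynamic thresholds from prediction uncertainty).

import torch
import torch.nn.functional as F

class SelfKDLoss(torch.nn.Module):
    """Sketch: target-guided self-KD with an uncertainty-aware dynamic threshold."""

    def __init__(self, temperature: float = 4.0, init_threshold: float = 0.5,
                 ema_momentum: float = 0.9):
        super().__init__()
        self.temperature = temperature
        self.ema_momentum = ema_momentum
        # Dynamic confidence threshold; here it tracks an exponential moving
        # average of batch confidence as a simple stand-in for the learned
        # dynamic thresholds described in the abstract.
        self.register_buffer('threshold', torch.tensor(init_threshold))

    def forward(self, student_logits: torch.Tensor,
                teacher_logits: torch.Tensor) -> torch.Tensor:
        # teacher_logits: region-matching scores snapshotted from a previous
        # training iteration, used as a fixed (detached) teacher target.
        t = self.temperature
        teacher_probs = F.softmax(teacher_logits.detach() / t, dim=-1)

        # Prediction uncertainty measured as normalized entropy; confidence
        # is its complement (low entropy -> high confidence).
        entropy = -(teacher_probs * teacher_probs.clamp_min(1e-8).log()).sum(-1)
        entropy = entropy / torch.log(torch.tensor(float(teacher_probs.size(-1))))
        confidence = 1.0 - entropy

        # Refine the teacher knowledge: keep only predictions whose
        # confidence exceeds the current dynamic threshold.
        mask = (confidence > self.threshold).float()

        # Update the threshold from the observed uncertainty statistics.
        with torch.no_grad():
            self.threshold.mul_(self.ema_momentum).add_(
                (1.0 - self.ema_momentum) * confidence.mean())

        # Standard temperature-scaled KL distillation term, averaged over
        # the retained (high-confidence) samples only.
        log_student = F.log_softmax(student_logits / t, dim=-1)
        kd = F.kl_div(log_student, teacher_probs, reduction='none').sum(-1)
        return (kd * mask).sum() * (t ** 2) / mask.sum().clamp_min(1.0)

In use, the teacher logits would be the model's own region-matching scores saved from an earlier iteration, e.g. loss = SelfKDLoss()(current_logits, previous_iteration_logits); the EMA update is only one plausible way to realize an adaptive threshold under the assumptions above.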
Pages: 1254 - 1260
Page count: 7
Related Papers
50 records in total
  • [21] Discriminative Triad Matching and Reconstruction for Weakly Referring Expression Grounding
    Sun, Mingjie
    Xiao, Jimin
    Lim, Eng Gee
    Liu, Si
    Goulermas, John Y.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (11) : 4189 - 4195
  • [22] Knowledge Aided Consistency for Weakly Supervised Phrase Grounding
    Chen, Kan
    Gao, Jiyang
    Nevatia, Ram
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 4042 - 4050
  • [23] Sliding Cross Entropy for Self-Knowledge Distillation
    Lee, Hanbeen
    Kim, Jeongho
    Woo, Simon S.
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 1044 - 1053
  • [24] Self-Knowledge Distillation with Progressive Refinement of Targets
    Kim, Kyungyul
    Ji, ByeongMoon
    Yoon, Doyoung
    Hwang, Sangheum
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 6547 - 6556
  • [25] Self-knowledge distillation for surgical phase recognition
    Zhang, Jinglu
    Barbarisi, Santiago
    Kadkhodamohammadi, Abdolrahim
    Stoyanov, Danail
    Luengo, Imanol
    INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2024, 19 (01) : 61 - 68
  • [26] Diversified branch fusion for self-knowledge distillation
    Long, Zuxiang
    Ma, Fuyan
    Sun, Bin
    Tan, Mingkui
    Li, Shutao
    INFORMATION FUSION, 2023, 90 : 12 - 22
  • [27] Noisy Self-Knowledge Distillation for Text Summarization
    Liu, Yang
    Shen, Sheng
    Lapata, Mirella
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 692 - 703
  • [29] Refine Myself by Teaching Myself : Feature Refinement via Self-Knowledge Distillation
    Ji, Mingi
    Shin, Seungjae
    Hwang, Seunghyun
    Park, Gibeom
    Moon, Il-Chul
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 10659 - 10668
  • [30] Weakly Supervised Cross-lingual Semantic Relation Classification via Knowledge Distillation
    Vyas, Yogarshi
    Carpuat, Marine
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 5285 - 5296