Weakly Supervised Referring Expression Grounding via Dynamic Self-Knowledge Distillation

Cited by: 1
Authors
Mi, Jinpeng [1 ]
Chen, Zhiqian [1 ]
Zhang, Jianwei [2 ]
Affiliations
[1] Univ Shanghai Sci & Technol, Inst Machine Intelligence IMI, Shanghai, Peoples R China
[2] Univ Hamburg, Dept Informat, Tech Aspects Multimodal Syst TAMS, Hamburg, Germany
Source
2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IROS | 2023
Funding
US National Science Foundation
Keywords
RECONSTRUCTION;
DOI
10.1109/IROS55552.2023.10341909
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Weakly supervised referring expression grounding (WREG) is an attractive and challenging task that grounds target regions in images by understanding given referring expressions. WREG learns to ground target objects without manual annotations linking image regions to referring expressions during the model training phase. Different from the predominant grounding pattern of existing models, which locate target objects by reconstructing the region-expression correspondence, we investigate WREG from a novel perspective and enrich the prevailing pattern with self-knowledge distillation. Specifically, we propose a target-guided self-knowledge distillation approach that adopts the target prediction knowledge learned in previous training iterations as the teacher to guide the subsequent training procedure. To avoid misleading guidance from teacher knowledge with low prediction confidence, we present an uncertainty-aware knowledge refinement strategy that adaptively rectifies the teacher knowledge by learning dynamic threshold values based on the model prediction uncertainty. To validate the proposed approach, we conduct extensive experiments on three benchmark datasets, i.e., RefCOCO, RefCOCO+, and RefCOCOg. Our approach achieves new state-of-the-art results on several splits of the benchmark datasets, showcasing the advantage of the proposed framework for WREG. The implementation code and trained models are available at: https://github.com/dami23/WREG_Self_KD
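The abstract describes two coupled mechanisms: a teacher built from the model's own predictions in earlier training iterations, and a dynamic, uncertainty-driven threshold that filters low-confidence teacher signals before distillation. The sketch below illustrates how such a scheme could be wired up in PyTorch. It is an illustrative reading of the abstract, not the authors' released code; every name in it (self_kd_step, tau_base, the model call signature) is a hypothetical placeholder.

    import math
    import torch
    import torch.nn.functional as F

    def self_kd_step(model, regions, expression, teacher_logits, tau_base=0.5):
        # Current ("student") prediction: scores over candidate regions.
        student_logits = model(regions, expression)

        # Teacher: the model's own prediction from a previous training iteration.
        teacher_probs = F.softmax(teacher_logits, dim=-1).detach()

        # Prediction uncertainty as normalized entropy of the teacher distribution.
        entropy = -(teacher_probs * teacher_probs.clamp_min(1e-8).log()).sum(dim=-1)
        uncertainty = entropy / math.log(teacher_probs.size(-1))

        # Dynamic threshold: demand higher teacher confidence where uncertainty is high.
        threshold = tau_base + (1.0 - tau_base) * uncertainty
        confident = teacher_probs.max(dim=-1).values >= threshold

        # Distill (KL divergence) only from teacher predictions that pass the threshold.
        kd = F.kl_div(F.log_softmax(student_logits, dim=-1),
                      teacher_probs, reduction='none').sum(dim=-1)
        loss = (kd * confident.float()).sum() / confident.float().sum().clamp_min(1.0)

        # The current prediction can serve as the teacher for a later iteration.
        return loss, student_logits.detach()

Tying the threshold to normalized entropy means near-uniform (uncertain) teacher distributions are effectively excluded, which matches the abstract's stated goal of rectifying low-confidence teacher knowledge rather than propagating it.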
Pages: 1254 - 1260
Page count: 7
Related Papers
50 records in total
  • [41] Wittgenstein on Self-Knowledge and Self-Expression
    Jacobsen, R.
    PHILOSOPHICAL QUARTERLY, 1996, 46 (182) : 12 - 30
  • [42] Self-Enhanced Training Framework for Referring Expression Grounding
    Chen, Yitao
    Du, Ruoyi
    Liang, Kongming
    Ma, Zhanyu
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023 : 3060 - 3064
  • [43] Self-Knowledge: Expression without Expressivism
    Campbell, Lucy
    PHILOSOPHY AND PHENOMENOLOGICAL RESEARCH, 2022, 104 (01) : 186 - 208
  • [44] MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition
    Yang, Chuanguang
    An, Zhulin
    Zhou, Helong
    Cai, Linhang
    Zhi, Xiang
    Wu, Jiwen
    Xu, Yongjun
    Zhang, Qian
    COMPUTER VISION, ECCV 2022, PT XXIV, 2022, 13684 : 534 - 551
  • [45] Adaptive lightweight network construction method for Self-Knowledge Distillation
    Lu, Siyuan
    Zeng, Weiliang
    Li, Xueshi
    Ou, Jiajun
    NEUROCOMPUTING, 2025, 624
  • [46] Gaze-assisted visual grounding via knowledge distillation for referred object grasping with under-specified object referring
    Zhang, Zhuoyang
    Qian, Kun
    Zhou, Bo
    Fang, Fang
    Ma, Xudong
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [47] Weakly-Supervised Video Object Grounding via Causal Intervention
    Wang, Wei
    Gao, Junyu
    Xu, Changsheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (03) : 3933 - 3948
  • [48] Self-Knowledge Distillation for First Trimester Ultrasound Saliency Prediction
    Gridach, Mourad
    Savochkina, Elizaveta
    Drukker, Lior
    Papageorghiou, Aris T.
    Noble, J. Alison
    SIMPLIFYING MEDICAL ULTRASOUND, ASMUS 2022, 2022, 13565 : 117 - 127
  • [49] Decoupled Feature and Self-Knowledge Distillation for Speech Emotion Recognition
    Yu, Haixiang
    Ning, Yuan
    IEEE ACCESS, 2025, 13 : 33275 - 33285
  • [50] Teaching Yourself: A Self-Knowledge Distillation Approach to Action Recognition
    Duc-Quang Vu
    Le, Ngan
    Wang, Jia-Ching
    IEEE ACCESS, 2021, 9 : 105711 - 105723