Weakly Supervised Referring Expression Grounding via Dynamic Self-Knowledge Distillation

Cited by: 1
Authors
Mi, Jinpeng [1 ]
Chen, Zhiqian [1 ]
Zhang, Jianwei [2 ]
Affiliations
[1] Univ Shanghai Sci & Technol, Inst Machine Intelligence IMI, Shanghai, Peoples R China
[2] Univ Hamburg, Dept Informat, Tech Aspects Multimodal Syst TAMS, Hamburg, Germany
Source
2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IROS | 2023
Funding
US National Science Foundation
Keywords
RECONSTRUCTION;
DOI
10.1109/IROS55552.2023.10341909
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Weakly supervised referring expression grounding (WREG) is an attractive and challenging task that grounds target regions in images by understanding given referring expressions. WREG learns to ground target objects without manual annotations linking image regions to referring expressions during the model training phase. Different from the predominant grounding pattern of existing models, which locate target objects by reconstructing the region-expression correspondence, we investigate WREG from a novel perspective and enrich the prevailing pattern with self-knowledge distillation. Specifically, we propose a target-guided self-knowledge distillation approach that adopts the target prediction knowledge learned in previous training iterations as the teacher to guide the subsequent training procedure. To avoid misleading guidance from teacher knowledge with low prediction confidence, we present an uncertainty-aware knowledge refinement strategy that adaptively rectifies the teacher knowledge by learning dynamic threshold values based on the model prediction uncertainty. To validate the proposed approach, we conduct extensive experiments on three benchmark datasets, i.e., RefCOCO, RefCOCO+, and RefCOCOg. Our approach achieves new state-of-the-art results on several splits of the benchmark datasets, showcasing the advantage of the proposed framework for WREG. The implementation code and trained models are available at: https://github.com/dami23/WREG_Self_KD
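The abstract describes two coupled mechanisms: a teacher built from the model's own predictions in earlier training iterations, and a dynamic, uncertainty-driven threshold that filters low-confidence teacher signals before distillation. The sketch below illustrates how such a scheme could be wired up in PyTorch. It is an illustrative reading of the abstract, not the authors' released code; every name in it (self_kd_step, tau_base, the model call signature) is a hypothetical placeholder.

    import math
    import torch
    import torch.nn.functional as F

    def self_kd_step(model, regions, expression, teacher_logits, tau_base=0.5):
        # Current ("student") prediction: scores over candidate regions.
        student_logits = model(regions, expression)

        # Teacher: the model's own prediction from a previous training iteration.
        teacher_probs = F.softmax(teacher_logits, dim=-1).detach()

        # Prediction uncertainty as normalized entropy of the teacher distribution.
        entropy = -(teacher_probs * teacher_probs.clamp_min(1e-8).log()).sum(dim=-1)
        uncertainty = entropy / math.log(teacher_probs.size(-1))

        # Dynamic threshold: demand higher teacher confidence where uncertainty is high.
        threshold = tau_base + (1.0 - tau_base) * uncertainty
        confident = teacher_probs.max(dim=-1).values >= threshold

        # Distill (KL divergence) only from teacher predictions that pass the threshold.
        kd = F.kl_div(F.log_softmax(student_logits, dim=-1),
                      teacher_probs, reduction='none').sum(dim=-1)
        loss = (kd * confident.float()).sum() / confident.float().sum().clamp_min(1.0)

        # The current prediction can serve as the teacher for a later iteration.
        return loss, student_logits.detach()

Tying the threshold to normalized entropy means near-uniform (uncertain) teacher distributions are effectively excluded, which matches the abstract's stated goal of rectifying low-confidence teacher knowledge rather than propagating it.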
Pages: 1254 - 1260
Page count: 7
Related Papers
50 records in total
  • [41] Wittgenstein on Self-Knowledge and Self-Expression
    Jacobsen, R.
    PHILOSOPHICAL QUARTERLY, 1996, 46 (182) : 12 - 30
  • [42] Self-Enhanced Training Framework for Referring Expression Grounding
    Chen, Yitao
    Du, Ruoyi
    Liang, Kongming
    Ma, Zhanyu
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023 : 3060 - 3064
  • [43] Self-Knowledge: Expression without Expressivism
    Campbell, Lucy
    PHILOSOPHY AND PHENOMENOLOGICAL RESEARCH, 2022, 104 (01) : 186 - 208
  • [44] MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition
    Yang, Chuanguang
    An, Zhulin
    Zhou, Helong
    Cai, Linhang
    Zhi, Xiang
    Wu, Jiwen
    Xu, Yongjun
    Zhang, Qian
    COMPUTER VISION, ECCV 2022, PT XXIV, 2022, 13684 : 534 - 551
  • [45] Adaptive lightweight network construction method for Self-Knowledge Distillation
    Lu, Siyuan
    Zeng, Weiliang
    Li, Xueshi
    Ou, Jiajun
    NEUROCOMPUTING, 2025, 624
  • [46] Gaze-assisted visual grounding via knowledge distillation for referred object grasping with under-specified object referring
    Zhang, Zhuoyang
    Qian, Kun
    Zhou, Bo
    Fang, Fang
    Ma, Xudong
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [47] Weakly-Supervised Video Object Grounding via Causal Intervention
    Wang, Wei
    Gao, Junyu
    Xu, Changsheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (03) : 3933 - 3948
  • [48] Self-Knowledge Distillation for First Trimester Ultrasound Saliency Prediction
    Gridach, Mourad
    Savochkina, Elizaveta
    Drukker, Lior
    Papageorghiou, Aris T.
    Noble, J. Alison
    SIMPLIFYING MEDICAL ULTRASOUND, ASMUS 2022, 2022, 13565 : 117 - 127
  • [49] Decoupled Feature and Self-Knowledge Distillation for Speech Emotion Recognition
    Yu, Haixiang
    Ning, Yuan
    IEEE ACCESS, 2025, 13 : 33275 - 33285
  • [50] Teaching Yourself: A Self-Knowledge Distillation Approach to Action Recognition
    Duc-Quang Vu
    Le, Ngan
    Wang, Jia-Ching
    IEEE ACCESS, 2021, 9 : 105711 - 105723