Confusion-Based Metric Learning for Regularizing Zero-Shot Image Retrieval and Clustering

Cited by: 2

Authors:

Chen, Binghui

Deng, Weihong [1]

Wang, Biao

Zhang, Lei [2]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing 100876, Peoples R China

[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2024 / Vol. 35 / No. 02

Funding:

National Natural Science Foundation of China;

Keywords:

Measurement; Task analysis; Training; Learning systems; Image retrieval; Head; Semantics; Confusion; generalization; image retrieval; clustering; regularization; zero-shot learning (ZSL);

DOI:

10.1109/TNNLS.2022.3185668

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep metric learning has become attractive for the zero-shot image retrieval and clustering (ZSRC) task, in which a good embedding/metric is required so that unseen classes can be distinguished well. Most existing works take this "good" embedding to mean simply a discriminative one and race to devise powerful metric objectives or hard-sample mining strategies for learning discriminative deep metrics. In this article, however, we first emphasize that generalization ability is also a core ingredient of a "good" metric and that it largely determines metric performance in zero-shot settings. We then propose the confusion-based metric learning (CML) framework to explicitly optimize a robust metric. This is achieved mainly by introducing two regularization terms: the energy confusion (EC) term and the diversity confusion (DC) term. These terms break away from the traditional deep metric learning idea of designing discriminative objectives and instead seek to "confuse" the learned model. The EC and DC terms focus on local and global feature distribution confusion, respectively. We train these confusion terms together with the conventional deep metric objective in an adversarial manner. Although it may seem counterintuitive to "confuse" model learning, we show that CML serves as an efficient regularization framework for deep metric learning and is applicable to various conventional metric methods. This article experimentally demonstrates the importance of learning an embedding/metric with good generalization, achieving state-of-the-art performance on the popular CUB, CARS, Stanford Online Products, and In-Shop datasets for ZSRC tasks.

Citation

Pages: 1884-1897

Page count: 14

