Content-Attribute Disentanglement for Generalized Zero-Shot Learning

Cited by: 2
Authors
An, Yoojin [1 ]
Kim, Sangyeon [2 ]
Liang, Yuxuan [3 ]
Zimmermann, Roger [3 ]
Kim, Dongho [4 ]
Kim, Jihie [1 ]
Affiliations
[1] Dongguk Univ, Dept Artificial Intelligence, Seoul 04620, South Korea
[2] Naver Webtoon AI, Seongnam 13529, South Korea
[3] Natl Univ Singapore, Sch Comp, Singapore 119077, Singapore
[4] Dongguk Univ, Dongguk Inst Convergence Educ, Seoul 04620, South Korea
Keywords
Visualization; Prototypes; Feature extraction; Codes; Training; Semantics; Task analysis; Computer vision; deep learning; disentangled representation; generalized zero-shot learning
DOI
10.1109/ACCESS.2022.3178800
Chinese Library Classification (CLC)
TP [Automation and computer technology]
Discipline code
0812 (Computer Science and Technology)
Abstract
Humans can recognize or infer unseen classes of objects from descriptions of the classes' characteristics (semantic information). Conventional deep learning models trained in a supervised manner, however, cannot classify classes that were unseen during training. Hence, many studies have addressed generalized zero-shot learning (GZSL), which aims to produce systems that recognize both seen and unseen classes by transferring learned knowledge from seen to unseen classes. Since seen and unseen classes share a common semantic space, extracting appropriate semantic information from images is essential for GZSL. In addition to semantic-related information (attributes), images also contain semantic-unrelated information (contents), which can degrade the classification performance of the model. We therefore propose a content-attribute disentanglement architecture that separates the content and attribute information of images. The proposed method comprises three major components: 1) a feature generation module that synthesizes unseen visual features; 2) a content-attribute disentanglement module that discriminates content and attribute codes from images; and 3) an attribute comparator module that measures the compatibility between the attribute codes and the class prototypes, which act as the ground truth. Extensive experiments show that our method achieves state-of-the-art or competitive results on four GZSL benchmark datasets and outperforms existing zero-shot learning methods on all of them. Our method also achieves the best accuracy in a zero-shot retrieval task. Our code is available at https://github.com/anyoojin1996/CA-GZSL.
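The authors' actual implementation lives at the GitHub link above; the following is only a minimal PyTorch sketch of the three-module layout the abstract describes. Every dimension, layer size, and design choice here (a conditional MLP generator, a two-branch encoder, a dot-product comparator) is an illustrative assumption, not the paper's architecture.

import torch
import torch.nn as nn

# All sizes below are illustrative assumptions, not the paper's settings.
FEAT_DIM = 2048   # e.g., ResNet-101 visual features, a common GZSL backbone
ATTR_DIM = 312    # e.g., CUB class-attribute (prototype) dimension
Z_DIM = 128       # noise dimension for the feature generator
CODE_DIM = 256    # size of each disentangled code

class FeatureGenerator(nn.Module):
    # Synthesizes a visual feature from noise plus a class prototype,
    # so unseen classes gain training examples.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(Z_DIM + ATTR_DIM, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, FEAT_DIM), nn.ReLU())
    def forward(self, z, prototype):
        return self.net(torch.cat([z, prototype], dim=1))

class Disentangler(nn.Module):
    # Splits a visual feature into a semantic-unrelated content code
    # and a semantic-related attribute code.
    def __init__(self):
        super().__init__()
        self.content_enc = nn.Sequential(nn.Linear(FEAT_DIM, CODE_DIM), nn.ReLU())
        self.attr_enc = nn.Sequential(nn.Linear(FEAT_DIM, CODE_DIM), nn.ReLU())
    def forward(self, feat):
        return self.content_enc(feat), self.attr_enc(feat)

class AttributeComparator(nn.Module):
    # Scores how compatible an attribute code is with each class prototype
    # (here a simple dot product in a shared embedding space).
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(ATTR_DIM, CODE_DIM)
    def forward(self, attr_code, prototypes):
        proto_emb = self.embed(prototypes)   # (num_classes, CODE_DIM)
        return attr_code @ proto_emb.t()     # (batch, num_classes) compatibility logits

# Usage sketch: classify a feature by its most compatible prototype.
disentangler, comparator = Disentangler(), AttributeComparator()
prototypes = torch.randn(50, ATTR_DIM)   # seen + unseen class prototypes
feat = torch.randn(4, FEAT_DIM)          # a batch of image features
_, attr_code = disentangler(feat)
pred = comparator(attr_code, prototypes).argmax(dim=1)

# At training time the generator would supply synthetic unseen-class features:
fake_feat = FeatureGenerator()(torch.randn(4, Z_DIM), prototypes[:4])

In this reading, classification reduces to picking the prototype most compatible with the attribute code, while the content branch absorbs the semantic-unrelated variation; consult the repository for the method as published.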
Pages: 58320-58331
Page count: 12