Visual Structure Constraint for Transductive Zero-Shot Learning in the Wild

Cited by: 5
Authors
Wan, Ziyu [1 ]
Chen, Dongdong [2 ]
Liao, Jing [1 ]
Affiliations
[1] City Univ Hong Kong, Kowloon, Hong Kong, Peoples R China
[2] Microsoft Cloud AI, Lexington, KY USA
Keywords
Computer vision; Zero-shot learning; Visual structure constraint;
DOI
10.1007/s11263-021-01451-1
Chinese Library Classification (CLC) Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To recognize objects of unseen classes, most existing Zero-Shot Learning (ZSL) methods first learn a compatible projection function between the common semantic space and the visual space from the data of the source seen classes, and then directly apply it to the target unseen classes. However, for data in the wild, the source and target distributions may not match well, causing the well-known domain shift problem. Based on the observation that visual features of test instances can be separated into different clusters, we propose a new visual structure constraint on class centers for transductive ZSL, which improves the generality of the projection function (i.e., alleviates the above domain shift problem). Specifically, three different strategies (symmetric Chamfer distance, bipartite matching distance, and Wasserstein distance) are adopted to align the projected unseen semantic centers with the visual cluster centers of the test instances. We also propose two new training strategies to handle data in the wild, where the test dataset may contain many unrelated images; this realistic setting has not been considered in previous methods. Extensive experiments demonstrate that the proposed visual structure constraint consistently brings substantial performance gains, and that the new training strategies make it generalize well to data in the wild. The source code is available at https://github.com/raywzy/VSC.
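As a minimal illustration of the alignment idea in the abstract (not the authors' implementation), the first of the three strategies, the symmetric Chamfer distance, measures how well one set of centers covers another: each projected semantic center is matched to its nearest visual cluster center and vice versa, and the mean nearest-neighbor distances in both directions are summed. A sketch with NumPy, using toy 2-D "centers":

```python
import numpy as np

def chamfer_distance(A, B):
    """Symmetric Chamfer distance between point sets A (n, d) and B (m, d):
    mean nearest-neighbor distance from A to B plus from B to A."""
    # Pairwise Euclidean distances via broadcasting, shape (n, m)
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Toy example: A could stand for projected unseen semantic centers,
# B for visual cluster centers of test instances.
A = np.array([[0.0, 0.0], [1.0, 1.0]])
B = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
print(chamfer_distance(A, B))
```

Unlike bipartite matching, the Chamfer distance does not enforce a one-to-one correspondence, so it tolerates sets of different sizes, as in the unequal toy sets above.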
Pages: 1893-1909 (17 pages)