Zero-Shot Transfer Learning Based on Visual and Textual Resemblance

Cited by: 2
|
Authors
Yang, Gang [1 ]
Xu, Jieping [1 ]
Affiliations
[1] Renmin Univ China, Key Lab Data Engn & Knowledge Engn, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Transfer learning; Zero-shot learning; Deep learning;
DOI
10.1007/978-3-030-36718-3_30
Chinese Library Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing image search engines, whose ranking functions are built on labeled images or surrounding text, perform poorly on queries containing new or low-frequency keywords. In this paper, we propose zero-shot transfer learning (ZSTL), which transfers networks from given classifiers to new zero-shot classifiers at little cost, and helps image search perform better on new or low-frequency words. Content-based queries (i.e., ranking images not only by their visual appearance but also by their content) can also be enhanced by ZSTL. ZSTL was motivated by a resemblance we observed between photographic composition and the description of objects in natural language: both highlight an object by stressing its particularity, so we posit a resemblance between the visual and textual spaces. We provide several ways to transfer visual features into textual ones; applying deep learning and Word2Vec models to Wikipedia yielded impressive results. Our experiments present evidence supporting the resemblance between composition and description, and show the feasibility and effectiveness of transferring zero-shot classifiers. With these transferred zero-shot classifiers, image-ranking queries with low-frequency or new words can be handled. The proposed image search engine adopts cosine-distance ranking as its ranking algorithm. Experiments on image search show the superior performance of ZSTL.
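The abstract states that the proposed search engine ranks images by cosine distance between embeddings. A minimal sketch of that ranking step is below; the function names and the toy vectors are illustrative assumptions, not taken from the paper itself.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_images(query_vec, image_vecs):
    """Return (image_id, score) pairs sorted by descending cosine similarity."""
    scored = [(img_id, cosine_similarity(query_vec, vec))
              for img_id, vec in image_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy example: a 3-d query embedding against three image embeddings.
query = [1.0, 0.0, 1.0]
images = {
    "img_a": [1.0, 0.0, 1.0],  # same direction as the query
    "img_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "img_c": [1.0, 1.0, 1.0],  # partially aligned
}
ranking = rank_images(query, images)
print([img_id for img_id, _ in ranking])  # img_a ranks first, img_b last
```

In a full system the query vector would come from the textual (e.g. Word2Vec) space and the image vectors from the transferred visual classifiers; only the ranking step is sketched here.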
Pages: 353 - 362
Page count: 10
Related Papers
50 records in total
  • [31] PLSA-BASED ZERO-SHOT LEARNING
    Hoo, Wai Lam
    Chan, Chee Seng
    2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 4297 - 4301
  • [32] Zero-Shot Learning Based on Knowledge Sharing
    Zeng, Ting
    Xiang, Hongxin
    Xie, Cheng
    Yang, Yun
    Liu, Qing
    PROCEEDINGS OF THE 2021 IEEE 24TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN (CSCWD), 2021, : 643 - 648
  • [33] Learning semantic consistency for audio-visual zero-shot learning
    Xiaoyong Li
    Jing Yang
    Yuling Chen
    Wei Zhang
    Xiaoli Ruan
    Chengjiang Li
    Zhidong Su
    Artificial Intelligence Review, 58 (7)
  • [34] Predicting Visual Exemplars of Unseen Classes for Zero-Shot Learning
    Changpinyo, Soravit
    Chao, Wei-Lun
    Sha, Fei
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 3496 - 3505
  • [35] Contrastive visual feature filtering for generalized zero-shot learning
    Meng, Shixuan
    Jiang, Rongxin
    Tian, Xiang
    Zhou, Fan
    Chen, Yaowu
    Liu, Junjie
    Shen, Chen
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024,
  • [36] Dynamic visual-guided selection for zero-shot learning
    Zhou, Yuan
    Xiang, Lei
    Liu, Fan
    Duan, Haoran
    Long, Yang
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (03): : 4401 - 4419
  • [37] Disentangling Semantic-to-Visual Confusion for Zero-Shot Learning
    Ye, Zihan
    Hu, Fuyuan
    Lyu, Fan
    Li, Linyan
    Huang, Kaizhu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 2828 - 2840
  • [38] Learning discriminative visual semantic embedding for zero-shot recognition
    Xie, Yurui
    Song, Tiecheng
    Yuan, Jianying
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 115
  • [39] Visual Structure Constraint for Transductive Zero-Shot Learning in the Wild
    Ziyu Wan
    Dongdong Chen
    Jing Liao
    International Journal of Computer Vision, 2021, 129 : 1893 - 1909
  • [40] Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning
    Kim, Hanjae
    Lee, Jiyoung
    Park, Seongheon
    Sohn, Kwanghoon
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 5652 - 5662