LVAR-CZSL: Learning Visual Attributes Representation for Compositional Zero-Shot Learning

Authors
Ma, Xingjiang [1]
Yang, Jing [1, 2]
Lin, Jiacheng [3]
Zheng, Zhenzhe [4]
Li, Shaobo [1]
Hu, Bingqi [1]
Tang, Xianghong [1]
Affiliations
[1] Guizhou Univ, State Key Lab Publ Big Data, Guiyang 550025, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[3] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Peoples R China
[4] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Visualization; Feature extraction; Dogs; Task analysis; Attention mechanisms; Zero-shot learning; Circuits and systems; Compositional zero-shot learning; visual attributes; objects and attributes; inter-class connectivity; OBJECTS;
DOI
10.1109/TCSVT.2024.3444782
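Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;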
Abstract
Compositional Zero-Shot Learning (CZSL) has been applied to various scenarios, including scene understanding, visual-language representation, and domain adaptation. Despite numerous efforts and significant advances, two crucial issues remain insufficiently addressed: the fuzzy conceptualization of visual attributes and insufficient inter-class connectivity. To address these issues, we propose Learning Visual Attributes Representation for Compositional Zero-Shot Learning (LVAR-CZSL), which learns visual attributes and inter-class dependencies. LVAR-CZSL comprises two key components: the Visual Attribute Representation Module (VARM) and the Connected Learning Module (CLM). Specifically, VARM extracts detailed attribute and object visual features from global visual features, resolving the issue of fuzzy visual attribute concepts. CLM enables LVAR-CZSL to perceive the connectivity between different attributes and objects, effectively enhancing inter-class connectivity. To couple VARM and CLM closely and minimize the gap between image and text features, we introduce the composition-attribute-object Joint Scoring Function (JSF). Additionally, we propose a Joint Loss Function (JLF) to optimize the learning process of VARM and CLM. Experimental results on four datasets show that LVAR-CZSL achieves state-of-the-art performance. The code is available at https://github.com/mxjmxj1/LVAR-CZSL.
Pages: 13311-13323
Page count: 13