PSVMA+: Exploring Multi-Granularity Semantic-Visual Adaption for Generalized Zero-Shot Learning

Cited: 0
Authors
Liu, Man [1 ]
Bai, Huihui [1 ]
Li, Feng [2 ]
Zhang, Chunjie [1 ]
Wei, Yunchao [1 ]
Wang, Meng [2 ]
Chua, Tat-Seng [3 ]
Zhao, Yao [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing Key Lab Adv Informat Sci & Network Technol, Beijing 100082, Peoples R China
[2] Hefei Univ Technol, Hefei 230002, Peoples R China
[3] Natl Univ Singapore, Singapore 119077, Singapore
Funding
Beijing Natural Science Foundation; National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Multi-granularity; Zero-shot learning; Semantic-visual interactions;
DOI
10.1109/TPAMI.2024.3467229
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generalized zero-shot learning (GZSL) endeavors to identify the unseen categories using knowledge from the seen domain, necessitating the intrinsic interactions between the visual features and attribute semantic features. However, GZSL suffers from insufficient visual-semantic correspondences due to the attribute diversity and instance diversity. Attribute diversity refers to varying semantic granularity in attribute descriptions, ranging from low-level (specific, directly observable) to high-level (abstract, highly generic) characteristics. This diversity challenges the collection of adequate visual cues for attributes under a uni-granularity. Additionally, diverse visual instances corresponding to the same sharing attributes introduce semantic ambiguity, leading to vague visual patterns. To tackle these problems, we propose a multi-granularity progressive semantic-visual mutual adaption (PSVMA+) network, where sufficient visual elements across granularity levels can be gathered to remedy the granularity inconsistency. PSVMA+ explores semantic-visual interactions at different granularity levels, enabling awareness of multi-granularity in both visual and semantic elements. At each granularity level, the dual semantic-visual transformer module (DSVTM) recasts the sharing attributes into instance-centric attributes and aggregates the semantic-related visual regions, thereby learning unambiguous visual features to accommodate various instances. Given the diverse contributions of different granularities, PSVMA+ employs selective cross-granularity learning to leverage knowledge from reliable granularities and adaptively fuses multi-granularity features for comprehensive representations. Experimental results demonstrate that PSVMA+ consistently outperforms state-of-the-art methods.
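The two mechanisms the abstract describes — recasting shared attribute prototypes into instance-centric attributes by letting them attend over visual regions at each granularity level, then selectively fusing across granularities — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature dimensions, patch counts, mean pooling, and the norm-based selection score are all placeholder assumptions standing in for the learned components of DSVTM and the selective cross-granularity module.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: queries attend over keys/values."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)    # (num_attrs, num_patches)
    return softmax(scores, axis=-1) @ values  # (num_attrs, dim)

rng = np.random.default_rng(0)
dim = 16                                  # shared embedding dim (assumed)
attrs = rng.normal(size=(5, dim))         # 5 shared attribute prototypes

# Visual patch features at two hypothetical granularity levels
# (e.g. coarse 7x7 and fine 14x14 feature maps, flattened).
granularities = [rng.normal(size=(p, dim)) for p in (49, 196)]

# Per level: recast sharing attributes into instance-centric attributes
# (attributes attend over this instance's patches), then pool a descriptor.
per_level = [cross_attention(attrs, patches, patches).mean(axis=0)
             for patches in granularities]

# Selective cross-granularity fusion: weight each level by a reliability
# score (stand-in here: the norm of its descriptor) and combine.
weights = softmax(np.array([np.linalg.norm(f) for f in per_level]))
fused = sum(w * f for w, f in zip(weights, per_level))  # (dim,) representation
print(fused.shape)
```

In the paper these pieces are learned transformer blocks rather than raw dot products, but the data flow — semantics querying instance-specific visual evidence per granularity, then a weighted cross-granularity merge — follows the abstract's description.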
Pages: 51-66 (16 pages)
Related Papers
50 items
  • [31] Visual-Semantic Graph Matching Net for Zero-Shot Learning
    Duan, Bowen
    Chen, Shiming
    Guo, Yufei
    Xie, Guo-Sen
    Ding, Weiping
    Wang, Yisong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024
  • [32] Zero-shot learning via visual-semantic aligned autoencoder
    Wei, Tianshu
    Huang, Jinjie
    Jin, Cong
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2023, 20 (08) : 14081 - 14095
  • [33] SEMANTIC MANIFOLD ALIGNMENT IN VISUAL FEATURE SPACE FOR ZERO-SHOT LEARNING
    Liao, Changsu
    Su, Li
    Zhang, Wegang
    Huang, Qingming
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2018
  • [34] Visual-Semantic Aligned Bidirectional Network for Zero-Shot Learning
    Gao, Rui
    Hou, Xingsong
    Qin, Jie
    Shen, Yuming
    Long, Yang
    Liu, Li
    Zhang, Zhao
    Shao, Ling
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 1649 - 1664
  • [35] Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview
    Ren, Wenqi
    Tang, Yang
    Sun, Qiyu
    Zhao, Chaoqiang
    Han, Qing-Long
    IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2024, 11 (05) : 1106 - 1126
  • [36] Semantic-aware visual attributes learning for zero-shot recognition
    Xie, Yurui
    Song, Tiecheng
    Li, Wei
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2021, 74
  • [37] Generalized Zero-Shot Extreme Multi-label Learning
    Gupta, Nilesh
    Bohra, Sakina
    Prabhu, Yashoteja
    Purohit, Saurabh
    Varma, Manik
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 527 - 535