PSVMA+: Exploring Multi-Granularity Semantic-Visual Adaption for Generalized Zero-Shot Learning

Cited by: 0
|
Authors
Liu, Man [1 ]
Bai, Huihui [1 ]
Li, Feng [2 ]
Zhang, Chunjie [1 ]
Wei, Yunchao [1 ]
Wang, Meng [2 ]
Chua, Tat-Seng [3 ]
Zhao, Yao [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing Key Lab Adv Informat Sci & Network Technol, Beijing 100082, Peoples R China
[2] Hefei Univ Technol, Hefei 230002, Peoples R China
[3] Natl Univ Singapore, Singapore 119077, Singapore
Funding
Beijing Natural Science Foundation; National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
multi-granularity; zero-shot learning; semantic-visual interactions
DOI
10.1109/TPAMI.2024.3467229
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generalized zero-shot learning (GZSL) endeavors to identify the unseen categories using knowledge from the seen domain, necessitating the intrinsic interactions between the visual features and attribute semantic features. However, GZSL suffers from insufficient visual-semantic correspondences due to the attribute diversity and instance diversity. Attribute diversity refers to varying semantic granularity in attribute descriptions, ranging from low-level (specific, directly observable) to high-level (abstract, highly generic) characteristics. This diversity challenges the collection of adequate visual cues for attributes under a uni-granularity. Additionally, diverse visual instances corresponding to the same sharing attributes introduce semantic ambiguity, leading to vague visual patterns. To tackle these problems, we propose a multi-granularity progressive semantic-visual mutual adaption (PSVMA+) network, where sufficient visual elements across granularity levels can be gathered to remedy the granularity inconsistency. PSVMA+ explores semantic-visual interactions at different granularity levels, enabling awareness of multi-granularity in both visual and semantic elements. At each granularity level, the dual semantic-visual transformer module (DSVTM) recasts the sharing attributes into instance-centric attributes and aggregates the semantic-related visual regions, thereby learning unambiguous visual features to accommodate various instances. Given the diverse contributions of different granularities, PSVMA+ employs selective cross-granularity learning to leverage knowledge from reliable granularities and adaptively fuses multi-granularity features for comprehensive representations. Experimental results demonstrate that PSVMA+ consistently outperforms state-of-the-art methods.
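The two core mechanisms described in the abstract — recasting sharing attributes into instance-centric attributes by attending over an instance's visual regions, and selectively fusing features across granularity levels by their reliability — can be illustrated with a minimal NumPy sketch. This is not the authors' DSVTM implementation; all function names, tensor shapes, and the attention/fusion formulation here are simplifying assumptions for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def instance_centric_attributes(attrs, visual):
    """Adapt sharing attributes (K x D) to one instance by attending
    over its visual region features (R x D), returning K x D
    instance-centric attributes (a stand-in for DSVTM's recasting)."""
    scores = attrs @ visual.T / np.sqrt(attrs.shape[1])  # (K, R)
    weights = softmax(scores, axis=-1)                   # attend per attribute
    return weights @ visual                              # (K, D)

def fuse_granularities(features, reliability):
    """Selective cross-granularity fusion: softmax over per-granularity
    reliability scores weights each granularity's feature (G x D)."""
    w = softmax(np.asarray(reliability, dtype=float))    # (G,)
    return (w[:, None] * features).sum(axis=0)           # (D,)

rng = np.random.default_rng(0)
K, R, D, G = 5, 7, 8, 3   # attributes, regions, feature dim, granularities
attrs = rng.normal(size=(K, D))

feats = []
for g in range(G):
    visual = rng.normal(size=(R, D))                 # regions at granularity g
    adapted = instance_centric_attributes(attrs, visual)
    feats.append(adapted.mean(axis=0))               # pool attributes to one vector

reliability = [0.2, 1.5, 0.6]                        # hypothetical per-granularity scores
fused = fuse_granularities(np.stack(feats), reliability)
print(fused.shape)  # (8,)
```

The softmax over reliability scores means an unreliable granularity is down-weighted rather than discarded, mirroring the paper's idea of leveraging knowledge from reliable granularities while still producing one comprehensive representation.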
Pages: 51-66
Page count: 16
Related papers
50 records in total
  • [1] Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
    Liu, Man
    Li, Feng
    Zhang, Chunjie
    Wei, Yunchao
    Bai, Huihui
    Zhao, Yao
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 15337 - 15346
  • [2] Semantic-visual shared knowledge graph for zero-shot learning
    Yu, Beibei
    Xie, Cheng
    Tang, Peng
    Li, Bin
    PEERJ COMPUTER SCIENCE, 2023, 9
  • [3] HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning
    Chen, Shiming
    Xie, Guo-Sen
    Liu, Yang
    Peng, Qinmu
    Sun, Baigui
    Li, Hao
    You, Xinge
    Shao, Ling
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [4] Semantic-Visual Combination Propagation Network for Zero-Shot Learning
    Song, Wenli
    Zhang, Lei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2022, 69 (04) : 2341 - 2345
  • [6] Learning cross-domain semantic-visual relationships for transductive zero-shot learning
    Lv, Fengmao
    Zhang, Jianyang
    Yang, Guowu
    Feng, Lei
    Yu, Yufeng
    Duan, Lixin
    PATTERN RECOGNITION, 2023, 141
  • [7] Multi-granularity contrastive zero-shot learning model based on attribute decomposition
    Wang, Yuanlong
    Wang, Jing
    Fan, Yue
    Chai, Qinghua
    Zhang, Hu
    Li, Xiaoli
    Li, Ru
    INFORMATION PROCESSING & MANAGEMENT, 2025, 62 (01)
  • [8] Semantic-Visual Consistency Constraint Network for Zero-Shot Image Semantic Segmentation
    Chen, Qiong
    Feng, Yuan
    Li, Zhiqun
    Yang, Yong
    Huanan Ligong Daxue Xuebao/Journal of South China University of Technology (Natural Science), 2024, 52 (10): : 41 - 50
  • [9] Zero-Shot Point Cloud Segmentation by Semantic-Visual Aware Synthesis
    Yang, Yuwei
    Hayat, Munawar
    Jin, Zhao
    Zhu, Hongyuan
    Lei, Yinjie
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 11552 - 11562
  • [10] Visual-semantic consistency matching network for generalized zero-shot learning
    Zhang, Zhenqi
    Cao, Wenming
    NEUROCOMPUTING, 2023, 536 : 30 - 39