PSVMA+: Exploring Multi-Granularity Semantic-Visual Adaption for Generalized Zero-Shot Learning

Times cited: 0
Authors
Liu, Man [1]
Bai, Huihui [1]
Li, Feng [2]
Zhang, Chunjie [1]
Wei, Yunchao [1]
Wang, Meng [2]
Chua, Tat-Seng [3]
Zhao, Yao [1]
Affiliations
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing Key Lab Adv Informat Sci & Network Technol, Beijing 100082, Peoples R China
[2] Hefei Univ Technol, Hefei 230002, Peoples R China
[3] Natl Univ Singapore, Singapore 119077, Singapore
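Funding
Beijing Natural Science Foundation; National Key Research and Development Program of China; National Natural Science Foundation of China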
Keywords
multi-granularity; zero-shot learning; semantic-visual interactions
DOI
10.1109/TPAMI.2024.3467229
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Generalized zero-shot learning (GZSL) endeavors to identify the unseen categories using knowledge from the seen domain, necessitating the intrinsic interactions between the visual features and attribute semantic features. However, GZSL suffers from insufficient visual-semantic correspondences due to the attribute diversity and instance diversity. Attribute diversity refers to varying semantic granularity in attribute descriptions, ranging from low-level (specific, directly observable) to high-level (abstract, highly generic) characteristics. This diversity challenges the collection of adequate visual cues for attributes under a uni-granularity. Additionally, diverse visual instances corresponding to the same sharing attributes introduce semantic ambiguity, leading to vague visual patterns. To tackle these problems, we propose a multi-granularity progressive semantic-visual mutual adaption (PSVMA+) network, where sufficient visual elements across granularity levels can be gathered to remedy the granularity inconsistency. PSVMA+ explores semantic-visual interactions at different granularity levels, enabling awareness of multi-granularity in both visual and semantic elements. At each granularity level, the dual semantic-visual transformer module (DSVTM) recasts the sharing attributes into instance-centric attributes and aggregates the semantic-related visual regions, thereby learning unambiguous visual features to accommodate various instances. Given the diverse contributions of different granularities, PSVMA+ employs selective cross-granularity learning to leverage knowledge from reliable granularities and adaptively fuses multi-granularity features for comprehensive representations. Experimental results demonstrate that PSVMA+ consistently outperforms state-of-the-art methods.
Pages: 51-66
Number of pages: 16