Audio-Visual Generalized Zero-Shot Learning Based on Variational Information Bottleneck

Citations: 0
Authors
Li, Yapeng
Luo, Yong [1]
Du, Bo [1]
Affiliations
[1] Wuhan Univ, Inst Artificial Intelligence, Sch Comp Sci, Natl Engn Res Ctr Multimedia Software, Wuhan 430072, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME | 2023
Funding
National Natural Science Foundation of China;
Keywords
Audio-visual; generalized zero-shot learning; information bottleneck; multi-modality fusion;
DOI
10.1109/ICME55011.2023.00084
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Audio-visual generalized zero-shot learning (GZSL) aims to train a model on seen classes so that it can classify samples from both seen and unseen classes. Because no training samples are available for unseen classes, the model tends to misclassify unseen-class samples as seen classes. To mitigate this problem, we propose a method based on the variational information bottleneck for audio-visual GZSL. Specifically, we model the joint representation as a product-of-experts over the marginal representations to integrate audio and visual information. In addition, we apply the variational information bottleneck to the learning of the audio-visual joint representation and of the marginal representations of the audio, visual, and text label modalities. This helps the model reduce the negative impact of information that does not generalize to unseen classes. Experiments on the UCF-GZSL, VGGSound-GZSL, and ActivityNet-GZSL benchmarks demonstrate the effectiveness and superiority of the proposed model for audio-visual GZSL.
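The abstract names two ingredients: a product-of-experts over marginal representations, and a variational information bottleneck term. The following is a minimal sketch of how these are commonly computed, not the authors' implementation. It assumes each modality is encoded as a diagonal Gaussian and that the bottleneck is a KL divergence to a standard normal prior; the function names, shapes, and toy values are hypothetical.

```python
import numpy as np

def product_of_experts(mus, logvars):
    """Fuse diagonal Gaussian experts into one Gaussian.

    mus, logvars: arrays of shape (n_experts, dim).
    The product of Gaussians has precision equal to the sum of the
    experts' precisions and a precision-weighted mean.
    """
    precisions = np.exp(-logvars)              # 1 / variance per expert
    joint_var = 1.0 / precisions.sum(axis=0)
    joint_mu = joint_var * (mus * precisions).sum(axis=0)
    return joint_mu, np.log(joint_var)

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dimensions.

    Assumption: this plays the role of the compression term in a
    variational information bottleneck objective.
    """
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)

# Toy marginal representations (hypothetical values, unit variance).
audio_mu, audio_logvar = np.array([1.0, 0.0]), np.zeros(2)
visual_mu, visual_logvar = np.array([3.0, 0.0]), np.zeros(2)

joint_mu, joint_logvar = product_of_experts(
    np.stack([audio_mu, visual_mu]),
    np.stack([audio_logvar, visual_logvar]),
)
joint_kl = kl_to_standard_normal(joint_mu, joint_logvar)
```

With two unit-variance experts, the fused mean is the average of the two means and the fused variance is halved, which illustrates how agreement between modalities sharpens the joint representation.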
Pages: 450-455
Page count: 6
Related papers
50 in total
  • [21] Semantic Contrastive Embedding for Generalized Zero-Shot Learning
    Han, Zongyan
    Fu, Zhenyong
    Chen, Shuo
    Yang, Jian
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (11) : 2606 - 2622
  • [22] Dissimilarity Representation Learning for Generalized Zero-Shot Recognition
    Yang, Gang
    Liu, Jinlu
    Xu, Jieping
    Li, Xirong
    PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18), 2018, : 2032 - 2039
  • [23] Unbiased feature generating for generalized zero-shot learning
    Niu, Chang
    Shang, Junyuan
    Huang, Junchu
    Yang, Junmei
    Song, Yuting
    Zhou, Zhiheng
    Zhou, Guoxu
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2022, 89
  • [24] GENERALIZED ZERO-SHOT LEARNING USING MULTIMODAL VARIATIONAL AUTO-ENCODER WITH SEMANTIC CONCEPTS
    Bendre, Nihar
    Desai, Kevin
    Najafirad, Peyman
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 1284 - 1288
  • [25] Semantic Contrastive Embedding for Generalized Zero-Shot Learning
    Zongyan Han
    Zhenyong Fu
    Shuo Chen
    Jian Yang
    International Journal of Computer Vision, 2022, 130 : 2606 - 2622
  • [26] Learning Multiple Criteria Calibration for Generalized Zero-shot Learning
    Lu, Ziqian
    Lu, Zhe-Ming
    Yu, Yunlong
    He, Zewei
    Luo, Hao
    Zheng, Yangming
    KNOWLEDGE-BASED SYSTEMS, 2024, 300
  • [27] A Dual Discriminator Method for Generalized Zero-Shot Learning
    Wei, Tianshu
    Huang, Jinjie
    CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 79 (01): : 1599 - 1612
  • [28] Joint Visual and Semantic Optimization for zero-shot learning
    Wu, Hanrui
    Yan, Yuguang
    Chen, Sentao
    Huang, Xiangkang
    Wu, Qingyao
    Ng, Michael K.
    KNOWLEDGE-BASED SYSTEMS, 2021, 215 (215)
  • [29] Indirect visual–semantic alignment for generalized zero-shot recognition
    Yan-He Chen
    Mei-Chen Yeh
    Multimedia Systems, 2024, 30
  • [30] Prototypical Model with Information-theoretic Loss Functions for Generalized Zero-Shot Learning
    Ji, Chunlin
    Xiong, Zhan
    Zhang, Meiying
    Yang, Huiwen
    Chen, Feng
    Shen, Hanchun
    ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222, 2023, 222