Audio-Visual Generalized Zero-Shot Learning Based on Variational Information Bottleneck

Cited by: 0
Authors
Li, Yapeng
Luo, Yong [1]
Du, Bo [1]
Affiliations
[1] Wuhan Univ, Inst Artificial Intelligence, Sch Comp Sci, Natl Engn Res Ctr Multimedia Software, Wuhan 430072, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME | 2023
Funding
National Natural Science Foundation of China
Keywords
Audio-visual; generalized zero-shot learning; information bottleneck; multi-modality fusion
DOI
10.1109/ICME55011.2023.00084
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Audio-visual generalized zero-shot learning (GZSL) aims to train a model on seen classes so that it can classify samples from both seen and unseen classes. Because no training samples are available for unseen classes, the model tends to misclassify unseen-class samples as seen classes. To mitigate this problem, we propose a method based on the variational information bottleneck for audio-visual GZSL. Specifically, we model the joint representation as a product of experts over the marginal representations to integrate audio and visual information. In addition, we apply the variational information bottleneck to the learning of the audio-visual joint representation and of the marginal representations of the audio, visual, and text-label modalities. This helps the model reduce the negative impact of information that does not generalize to unseen classes. Experimental results on the UCF-GZSL, VGGSound-GZSL, and ActivityNet-GZSL benchmarks demonstrate the effectiveness and superiority of the proposed model for audio-visual GZSL.
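The abstract does not give the exact formulation, so the following is only a minimal sketch of the two ingredients it names, under assumptions that are not taken from the paper: each modality encoder outputs a diagonal Gaussian (mean and log-variance), the experts are fused by precision weighting, and the bottleneck penalty is the KL divergence to a standard normal prior. The function names `product_of_experts` and `kl_to_standard_normal` are hypothetical.

```python
import math


def product_of_experts(mus, logvars):
    """Fuse diagonal Gaussian experts by precision weighting.

    mus, logvars: lists with one vector (list of floats) per expert.
    Returns the mean and log-variance of the (renormalized) product.
    Assumption: Gaussian experts; the paper's encoders may differ.
    """
    dim = len(mus[0])
    fused_mu, fused_logvar = [], []
    for d in range(dim):
        # Precision is the inverse variance of each expert in dimension d.
        precisions = [math.exp(-lv[d]) for lv in logvars]
        total = sum(precisions)
        # The fused mean is the precision-weighted average of expert means.
        fused_mu.append(sum(m[d] * p for m, p in zip(mus, precisions)) / total)
        # The fused variance is the inverse of the summed precisions.
        fused_logvar.append(-math.log(total))
    return fused_mu, fused_logvar


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar)
    )


# Hypothetical audio and visual marginals for one sample (2-D latent).
audio_mu, audio_logvar = [0.0, 1.0], [0.0, 0.0]
visual_mu, visual_logvar = [2.0, 1.0], [0.0, 0.0]

joint_mu, joint_logvar = product_of_experts(
    [audio_mu, visual_mu], [audio_logvar, visual_logvar]
)
bottleneck_penalty = kl_to_standard_normal(joint_mu, joint_logvar)
```

In a variational information bottleneck objective, a penalty of this kind is typically added to the task loss with a weighting coefficient, which discourages the latent code from retaining more input information than the task needs.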
Pages: 450-455
Number of pages: 6
Related papers
50 in total
  • [41] Multi-Dimensional Information Alignment in Different Modalities for Generalized Zero-Shot and Few-Shot Learning
    Cai, Jiyan
    Wu, Libing
    Wu, Dan
    Li, Jianxin
    Wu, Xianfeng
    INFORMATION, 2023, 14 (03)
  • [42] GENERALIZED ZERO-SHOT LEARNING USING CONDITIONAL WASSERSTEIN AUTOENCODER
    Kim, Junhan
    Shim, Byonghyo
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 3413 - 3417
  • [43] Bidirectional Mapping Coupled GAN for Generalized Zero-Shot Learning
    Shermin, Tasfia
    Teng, Shyh Wei
    Sohel, Ferdous
    Murshed, Manzur
    Lu, Guojun
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 721 - 733
  • [44] Dual Prototype Contrastive Network for Generalized Zero-Shot Learning
    Jiang, Huajie
    Li, Zhengxian
    Hu, Yongli
    Yin, Baocai
    Yang, Jian
    van den Hengel, Anton
    Yang, Ming-Hsuan
    Qi, Yuankai
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (02) : 1111 - 1122
  • [45] Cooperative Coupled Generative Networks for Generalized Zero-Shot Learning
    Sun, Liang
    Song, Junjie
    Wang, Ye
    Li, Baoyu
    IEEE ACCESS, 2020, 8 : 119287 - 119299
  • [46] ROBUST BIDIRECTIONAL GENERATIVE NETWORK FOR GENERALIZED ZERO-SHOT LEARNING
    Xing, Yun
    Huang, Sheng
    Huangfu, Luwen
    Chen, Feiyu
    Ge, Yongxin
    2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2020,
  • [47] Mitigating Generation Shifts for Generalized Zero-Shot Learning
    Chen, Zhi
    Luo, Yadan
    Wang, Sen
    Qiu, Ruihong
    Li, Jingjing
    Huang, Zi
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 844 - 852
  • [48] Content-Attribute Disentanglement for Generalized Zero-Shot Learning
    An, Yoojin
    Kim, Sangyeon
    Liang, Yuxuan
    Zimmermann, Roger
    Kim, Dongho
    Kim, Jihie
    IEEE ACCESS, 2022, 10 : 58320 - 58331
  • [49] Inference guided feature generation for generalized zero-shot learning
    Han, Zongyan
    Fu, Zhenyong
    Li, Guangyu
    Yang, Jian
    NEUROCOMPUTING, 2021, 430 : 150 - 158
  • [50] Generalized zero-shot domain adaptation via coupled conditional variational autoencoders
    Wang, Qian
    Breckon, Toby P.
    NEURAL NETWORKS, 2023, 163 : 40 - 52