Audio-Visual Generalized Zero-Shot Learning Based on Variational Information Bottleneck

Cited by: 0

Authors:

Li, Yapeng

Luo, Yong [1]

Du, Bo [1]

Affiliations:

[1] Wuhan Univ, Inst Artificial Intelligence, Sch Comp Sci, Natl Engn Res Ctr Multimedia Software, Wuhan 430072, Peoples R China

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

Audio-visual; generalized zero-shot learning; information bottleneck; multi-modality fusion;

DOI:

10.1109/ICME55011.2023.00084

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Audio-visual generalized zero-shot learning (GZSL) aims to train a model on seen classes so that it can classify samples from both seen and unseen classes. Because no training samples are available for unseen classes, the model tends to misclassify unseen-class samples as seen classes. To mitigate this problem, we propose a method for audio-visual GZSL based on the variational information bottleneck. Specifically, we model the joint representation as a product-of-experts over the marginal representations to integrate audio and visual information. In addition, we apply the variational information bottleneck to the learning of the audio-visual joint representation and of the marginal representations of the audio, visual, and text label modalities. This helps the model reduce the negative impact of information that cannot be generalized to unseen classes. Experiments on the UCF-GZSL, VGGSound-GZSL, and ActivityNet-GZSL benchmarks demonstrate the effectiveness and superiority of the proposed model for audio-visual GZSL.
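The abstract does not give implementation details, so the following is only a minimal sketch of the two ingredients it names, under stated assumptions: each modality's marginal representation is taken to be a diagonal Gaussian, the product-of-experts is the standard closed-form product of Gaussians, and the bottleneck penalty is taken to be a KL divergence to a standard normal prior. The function names, the 2-D toy inputs, and the choice of prior are hypothetical and not taken from the paper.

```python
import numpy as np

def product_of_experts(mus, logvars):
    """Fuse diagonal Gaussian experts N(mu_i, var_i) into one Gaussian.

    The precision of the product is the sum of the expert precisions, and
    its mean is the precision-weighted average of the expert means.
    """
    mus = np.asarray(mus, dtype=float)
    precisions = np.exp(-np.asarray(logvars, dtype=float))
    joint_var = 1.0 / precisions.sum(axis=0)
    joint_mu = joint_var * (precisions * mus).sum(axis=0)
    return joint_mu, np.log(joint_var)

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, var) || N(0, I)), summed over dimensions.

    Assumption: this plays the role of the compression term of a
    variational information bottleneck objective.
    """
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)

# Hypothetical 2-D audio and visual marginals with unit variance.
audio_mu, audio_logvar = np.array([1.0, 0.0]), np.zeros(2)
visual_mu, visual_logvar = np.array([3.0, 0.0]), np.zeros(2)

joint_mu, joint_logvar = product_of_experts(
    [audio_mu, visual_mu], [audio_logvar, visual_logvar]
)
kl = kl_to_standard_normal(joint_mu, joint_logvar)
```

With two equally confident experts, the joint mean lies midway between the two modality means and the joint variance is halved, which is how the product sharpens the estimate when modalities agree. Whether the paper also includes a prior expert in the product, or weights the KL term, is not stated in the abstract.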


Pages: 450-455

Page count: 6
