Instance-wise multi-view visual fusion for zero-shot learning

Cited by: 0
Authors
Tang, Long [1 ,2 ]
Zhao, Jingtao [1 ]
Tian, Yingjie [3 ]
Yao, Changhua [4 ]
Pardalos, Panos M. [5 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Artificial Intelligence, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Res Inst Talent Big Data, Nanjing 210044, Peoples R China
[3] Chinese Acad Sci, Res Ctr Fictitious Econ & Data Sci, Beijing 100190, Peoples R China
[4] Nanjing Univ Informat Sci & Technol, Sch Elect & Informat Engn, Nanjing 210044, Peoples R China
[5] Univ Florida, Ctr Appl Optimizat, Dept Ind & Syst Engn, Gainesville, FL 32611 USA
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Zero shot learning; Multi-view visual fusion; Consensus principle; Complementary principle; Multi-view manifold regularization;
DOI
10.1016/j.asoc.2024.112339
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Zero-shot learning (ZSL) has become increasingly popular in computer vision due to its ability to recognize categories unobserved in the training data. So far, most existing ZSL approaches adopt visual representations that are either derived from pretrained networks or learned with an end-to-end architecture. However, a single group of visual representations can hardly capture all the features hidden in the images, yielding incomplete visual information. In many real-life scenarios, multi-view visual representations are available; they describe the instances more comprehensively and hold potential for better learning performance. In this paper, we introduce an instance-wise multi-view visual fusion (IMVF) model for zero-shot learning (ZSL). In accordance with the consensus principle, a multi-view visual-semantic mapping is created by minimizing the disparities among the seen-class semantic projections obtained from different views. Meanwhile, a straightforward linear constraint is imposed on each seen-class instance to adhere to the complementary principle, so that cross-view information exchange is well motivated. To mitigate the domain shift problem, the predicted unseen-class semantic projections are further refined through a multi-view manifold alignment under the consensus principle. Our proposed IMVFZSL is compared with state-of-the-art ZSL methods on the AwA2, CUB and SUN datasets, and the experimental results validate the effectiveness of the IMVF mechanism. To the best of our knowledge, this is an initial attempt to fuse multi-view visual representations in ZSL, which we hope will stimulate valuable discussion in this field.
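The abstract describes the consensus and complementary principles only in prose. As a rough, non-authoritative illustration of the consensus idea, the sketch below fits two per-view visual-semantic mappings whose seen-class semantic projections are pulled toward agreement by a coupling penalty; every dimension, variable name (X1, X2, S, W1, W2) and weight (lam, gamma) is a hypothetical stand-in, not the authors' actual formulation.

```python
import numpy as np

# Minimal two-view toy setup; all sizes and weights below are assumptions.
rng = np.random.default_rng(0)
n, d1, d2, k = 100, 64, 48, 10           # seen instances, view dims, attribute dim
X1 = rng.standard_normal((n, d1))        # view-1 visual features
X2 = rng.standard_normal((n, d2))        # view-2 visual features
S = rng.standard_normal((n, k))          # per-instance class-attribute (semantic) targets

lam, gamma = 1.0, 1.0                    # ridge weight and consensus weight (assumed)

# Alternating ridge regressions coupled by a consensus penalty
# gamma * ||X1 @ W1 - X2 @ W2||_F^2, which pulls the per-view semantic
# projections of every seen-class instance toward agreement.
W1 = np.zeros((d1, k))
W2 = np.zeros((d2, k))
for _ in range(50):
    # W1 step: argmin ||X1 W1 - S||^2 + lam ||W1||^2 + gamma ||X1 W1 - X2 W2||^2
    A1 = (1.0 + gamma) * X1.T @ X1 + lam * np.eye(d1)
    W1 = np.linalg.solve(A1, X1.T @ (S + gamma * X2 @ W2))
    # W2 step: the symmetric update with the roles of the two views swapped
    A2 = (1.0 + gamma) * X2.T @ X2 + lam * np.eye(d2)
    W2 = np.linalg.solve(A2, X2.T @ (S + gamma * X1 @ W1))

# At test time, either view (or their average) can be projected into the
# semantic space and matched to the nearest unseen-class attribute vector.
z = 0.5 * (X1[0] @ W1 + X2[0] @ W2)      # fused semantic projection of one instance
print(z.shape)                           # (k,)
```

This sketch covers only the consensus coupling; the paper's instance-wise linear constraint (complementary principle) and the multi-view manifold alignment used for unseen classes are not reproduced here.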
Pages: 14
Related Papers
50 records in total
  • [21] Dynamic visual-guided selection for zero-shot learning
    Zhou, Yuan
    Xiang, Lei
    Liu, Fan
    Duan, Haoran
    Long, Yang
    [J]. JOURNAL OF SUPERCOMPUTING, 2024, 80 (03) : 4401 - 4419
  • [22] Disentangling Semantic-to-Visual Confusion for Zero-Shot Learning
    Ye, Zihan
    Hu, Fuyuan
    Lyu, Fan
    Li, Linyan
    Huang, Kaizhu
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 2828 - 2840
  • [23] Learning discriminative visual semantic embedding for zero-shot recognition
    Xie, Yurui
    Song, Tiecheng
    Yuan, Jianying
    [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 115
  • [26] Adversarial unseen visual feature synthesis for Zero-shot Learning
    Zhang, Haofeng
    Long, Yang
    Liu, Li
    Shao, Ling
    [J]. NEUROCOMPUTING, 2019, 329 : 12 - 20
  • [27] Transductive Visual-Semantic Embedding for Zero-shot Learning
    Xu, Xing
    Shen, Fumin
    Yang, Yang
    Shao, Jie
    Huang, Zi
    [J]. PROCEEDINGS OF THE 2017 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR'17), 2017, : 41 - 49
  • [28] Visual Structure Constraint for Transductive Zero-Shot Learning in the Wild
    Wan, Ziyu
    Chen, Dongdong
    Liao, Jing
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2021, 129 (06) : 1893 - 1909
  • [29] Zero-Shot Transfer Learning Based on Visual and Textual Resemblance
    Yang, Gang
    Xu, Jieping
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2019), PT III, 2019, 11955 : 353 - 362
  • [30] Spherical Zero-Shot Learning
    Shen, Jiayi
    Xiao, Zehao
    Zhen, Xiantong
    Zhang, Lei
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (02) : 634 - 645