Knowledge Guided Transformer Network for Compositional Zero-Shot Learning

Cited by: 0
Authors
Panda, Aditya [1]
Mukherjee, Dipti Prasad [1]
Affiliations
[1] Indian Statistical Institute, Kolkata, India
Keywords
Compositionality; compositional zero-shot learning; state-object composition; partial association
DOI
10.1145/3687129
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Compositional Zero-shot Learning (CZSL) attempts to recognise images of new compositions of states and objects when images of only a subset of state-object compositions are available as training data. An example of CZSL is recognising images of a peeled apple with a model trained only on images of a peeled orange, a ripe apple and a ripe orange. There are two major challenges in solving CZSL. First, the visual features of a state vary depending on the context of the state-object composition. For example, a state such as ripe produces distinct visual properties in the compositions ripe orange and ripe banana. Hence, understanding the context dependency of state features is a necessary requirement for solving CZSL. Second, the extent of association between the features of a state and an object varies significantly across different images of the same composition. For example, in different images of peeled oranges, the oranges may be peeled to different extents. As a consequence, the visual features of images of the class peeled orange may vary. Hence, there exists a significant amount of intra-class variability among the visual features of different images of a composition. Existing approaches merely look for the existence or absence of features of a particular state or object in a composition. Our approach looks not only for the existence of particular state or object features but also for the extent of association between state features and object features, to better tackle the intra-class variability in the visual features of compositional images. The proposed architecture is built around a novel Knowledge Guided Transformer. The transformer-based framework is utilised to model the larger context dependency between the state and the object. Extensive experiments on the C-GQA, MIT-States and UT-Zappos50k datasets demonstrate the superiority of the proposed approach over the state-of-the-art in both open-world and closed-world CZSL settings.
Pages: 25
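To make the CZSL setup described in the abstract concrete, below is a minimal, illustrative sketch of the generic scoring scheme such methods build on: every candidate state-object pair is embedded, fused into a composition embedding, and scored against image features, so that unseen compositions of seen states and objects can still be ranked. This is not the paper's Knowledge Guided Transformer; all names here (StateObjectScorer, embed_dim, the fusion MLP) are hypothetical placeholders.

```python
# Illustrative sketch only: a minimal compositional scorer for CZSL.
# It shows the generic idea of scoring an image against every candidate
# state-object composition, NOT the Knowledge Guided Transformer itself.

import torch
import torch.nn as nn


class StateObjectScorer(nn.Module):
    """Scores image features against all candidate state-object compositions."""

    def __init__(self, num_states: int, num_objects: int, embed_dim: int = 128):
        super().__init__()
        self.state_emb = nn.Embedding(num_states, embed_dim)
        self.object_emb = nn.Embedding(num_objects, embed_dim)
        # A small MLP fuses the state and object embeddings into a single
        # composition embedding living in the same space as the image features.
        self.fuse = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, img_feat: torch.Tensor, pairs: torch.Tensor) -> torch.Tensor:
        # img_feat: (B, embed_dim) image features from any visual backbone
        # pairs:    (P, 2) integer indices of candidate (state, object) pairs
        s = self.state_emb(pairs[:, 0])            # (P, embed_dim)
        o = self.object_emb(pairs[:, 1])           # (P, embed_dim)
        comp = self.fuse(torch.cat([s, o], dim=-1))  # (P, embed_dim)
        # Cosine similarity between each image and each composition embedding.
        img = nn.functional.normalize(img_feat, dim=-1)
        comp = nn.functional.normalize(comp, dim=-1)
        return img @ comp.t()                      # (B, P) compatibility scores


if __name__ == "__main__":
    # e.g. MIT-States has roughly 115 states and 245 objects.
    scorer = StateObjectScorer(num_states=115, num_objects=245)
    feats = torch.randn(4, 128)                    # stand-in for backbone features
    pairs = torch.tensor([[0, 1], [0, 2], [3, 1]])
    print(scorer(feats, pairs).shape)              # torch.Size([4, 3])
```

In this generic formulation the composition embedding is the same for every image of a class; the abstract's point is precisely that such a fixed pairing cannot capture how strongly state and object features are associated in a given image, which is what the proposed model additionally accounts for.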