GNDAN: Graph Navigated Dual Attention Network for Zero-Shot Learning

Cited by: 25
Authors
Chen, Shiming [1 ]
Hong, Ziming [1 ]
Xie, Guosen [2 ]
Peng, Qinmu [1 ]
You, Xinge [1 ]
Ding, Weiping [3 ]
Shao, Ling [4 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan 430074, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Peoples R China
[3] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
[4] Saudi Data & Artificial Intelligence Author SDAIA, Natl Ctr Artificial Intelligence NCAI, Riyadh, Saudi Arabia
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Visualization; Feature extraction; Task analysis; Knowledge transfer; Navigation; Learning systems; Attribute-based region features; graph attention network (GAT); graph neural network (GNN); zero-shot learning (ZSL);
DOI
10.1109/TNNLS.2022.3155602
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Zero-shot learning (ZSL) tackles the unseen-class recognition problem by transferring semantic knowledge from seen classes to unseen ones. Typically, to guarantee desirable knowledge transfer, a direct embedding is adopted to associate the visual and semantic domains in ZSL. However, most existing ZSL methods focus on learning the embedding from implicit global features or image regions to the semantic space. Thus, they fail to: 1) exploit the appearance relationship priors between local regions in a single image, which correspond to the semantic information, and 2) jointly learn cooperative global and local features for discriminative representations. In this article, we propose the novel graph navigated dual attention network (GNDAN) for ZSL to address these drawbacks. GNDAN employs a region-guided attention network (RAN) and a region-guided graph attention network (RGAT) to jointly learn a discriminative local embedding and incorporate global context into explicit global embeddings under the guidance of a graph. Specifically, RAN uses soft spatial attention to discover discriminative regions for generating local embeddings. Meanwhile, RGAT employs attribute-based attention to obtain attribute-based region features, where each attribute focuses on its most relevant image regions. Motivated by the graph neural network (GNN), which is well suited to representing structural relationships, RGAT further leverages a graph attention network to exploit the relationships between the attribute-based region features and form explicit global embedding representations. Based on a self-calibration mechanism, the learned joint visual embedding is matched with the semantic embedding to form the final prediction. Extensive experiments on three benchmark datasets demonstrate that the proposed GNDAN outperforms state-of-the-art methods. Our code and trained models are available at https://github.com/shiming-chen/GNDAN.
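The two attention stages described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that); all names, dimensions, and weight matrices below are illustrative assumptions. The first function pools region features per attribute (each attribute attends over image regions); the second applies a single-head GAT-style layer over a fully connected graph of the resulting attribute nodes to model their relationships.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attribute_attention(regions, attr_queries, W):
    """Attribute-based attention over image regions.

    regions:      (N, d) region features, e.g. a flattened CNN feature map.
    attr_queries: (K, q) attribute semantic embeddings.
    W:            (q, d) hypothetical projection aligning attributes with visuals.
    Returns (K, d): one attended region feature per attribute.
    """
    scores = attr_queries @ W @ regions.T      # (K, N) region relevance per attribute
    alpha = softmax(scores, axis=1)            # each attribute attends over regions
    return alpha @ regions                     # attribute-based region features

def graph_attention(node_feats, Wg, a):
    """Single-head GAT-style layer over a fully connected attribute graph."""
    h = node_feats @ Wg                        # (K, d2) projected node features
    K = h.shape[0]
    e = np.empty((K, K))
    for i in range(K):                         # e_ij = LeakyReLU(a^T [h_i || h_j])
        for j in range(K):
            z = np.concatenate([h[i], h[j]]) @ a
            e[i, j] = z if z > 0 else 0.2 * z
    alpha = softmax(e, axis=1)                 # normalized neighbor weights
    return alpha @ h                           # relation-aware node features

rng = np.random.default_rng(0)
regions = rng.normal(size=(49, 8))             # e.g. a 7x7 feature map, flattened
attrs = rng.normal(size=(5, 6))                # 5 attribute embeddings
F = attribute_attention(regions, attrs, rng.normal(size=(6, 8)))
G = graph_attention(F, rng.normal(size=(8, 8)), rng.normal(size=(16,)))
print(F.shape, G.shape)                        # → (5, 8) (5, 8)
```

In the paper's pipeline these relation-aware features would then be matched against class semantic embeddings to score unseen classes; the self-calibration step is omitted here for brevity.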
Pages: 4516-4529
Page count: 14