Multi-Attention Based Visual-Semantic Interaction for Few-Shot Learning

Authors
Zhao, Peng [1]
Wang, Yin [1]
Wang, Wei [2]
Mu, Jie [3]
Liu, Huiting [1]
Wang, Cong [2, 4]
Cao, Xiaochun [2]
Affiliations
[1] Anhui Univ, Sch Comp Sci & Technol, Hefei, Peoples R China
[2] Shenzhen Campus Sun Yat Sen Univ, Sch Cyber Sci & Technol, Guangzhou, Peoples R China
[3] Dongbei Univ Finance & Econ, Sch Data Sci & Artificial Intelligence, Dalian, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
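Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract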
Few-shot learning (FSL) aims to train a model that generalizes to new classes when each new class has only a few training samples. Because extracting discriminative features for new classes from so few samples is challenging, existing FSL methods leverage visual and semantic prior knowledge to guide discriminative feature learning. In the meta-learning setting, however, the semantic knowledge of the query set is unavailable, so query features lack discriminability. To address this problem, we propose a novel Multi-Attention based Visual-Semantic Interaction (MAVSI) approach for FSL. Specifically, we use spatial and channel attention mechanisms to select discriminative visual features: each support set sample is guided by its ground-truth semantics, while each query set sample is guided by all of the support set semantics. A relation module with the class prototypes of the support set is then employed to supervise and select discriminative visual features for the query set. To further enhance the discriminability of the support set, we introduce a visual-semantic contrastive learning module that promotes the similarity between visual features and their corresponding semantic features. Extensive experiments on four benchmark datasets demonstrate that the proposed MAVSI outperforms existing state-of-the-art FSL methods.
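The abstract names two building blocks without giving formulas: class prototypes of the support set, and a contrastive objective that pulls visual features toward their matching semantic features. The sketch below illustrates generic versions of both in plain Python. It is not the MAVSI implementation. The mean-of-features prototype, the cosine similarity, the InfoNCE-style loss, the `temperature` value, and all function names are assumptions chosen for illustration, and the attention and relation modules are not modelled.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def class_prototypes(support):
    """Map each class label to the mean of its support features (assumed prototype rule)."""
    return {label: mean_vector(features) for label, features in support.items()}


def classify(query, prototypes):
    """Assign a query feature to the class whose prototype is most similar."""
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


def contrastive_loss(visual, semantic, temperature=0.1):
    """InfoNCE-style loss where visual[i] is paired with semantic[i] (assumed form)."""
    total = 0.0
    for i, v in enumerate(visual):
        logits = [cosine(v, s) / temperature for s in semantic]
        peak = max(logits)  # subtract the maximum for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(visual)


# Toy 2-way 2-shot episode with made-up 2-D features.
support = {"cat": [[1.0, 0.0], [0.8, 0.2]], "dog": [[0.0, 1.0], [0.2, 0.8]]}
prototypes = class_prototypes(support)
predicted = classify([0.7, 0.3], prototypes)

# Matching visual-semantic pairs should give a lower loss than swapped pairs.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy episode the query `[0.7, 0.3]` lies closer to the `cat` prototype, and the loss is near zero for correctly paired features and large when the pairs are swapped, which is the behavior a contrastive objective of this kind is meant to reward.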
Pages: 1753-1761
Number of pages: 9