Style-Aware Contrastive Learning for Multi-Style Image Captioning

Cited by: 0
Authors
Zhou, Yucheng [1]
Long, Guodong [1]
Affiliations
[1] Univ Technol Sydney, Australian AI Inst, Sch Comp Sci, FEIT, Sydney, NSW, Australia
Source
17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023 | 2023
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing multi-style image captioning methods show promising results in generating a caption with accurate visual content and the desired linguistic style. However, they overlook the relationship between linguistic style and visual content. To overcome this drawback, we propose style-aware contrastive learning for multi-style image captioning. First, we present a style-aware visual encoder with contrastive learning to mine potential visual content relevant to style. Moreover, we propose a style-aware triplet contrast objective to distinguish whether the image, style, and caption match. To provide positive and negative samples for contrastive learning, we present three retrieval schemes: object-based retrieval, RoI-based retrieval, and triplet-based retrieval, and design a dynamic trade-off function to calculate retrieval scores. Experimental results demonstrate that our approach achieves state-of-the-art performance. In addition, we conduct an extensive analysis to verify the effectiveness of our method.
Pages: 2257-2267
Number of pages: 11