Style-Aware Contrastive Learning for Multi-Style Image Captioning

Authors
Zhou, Yucheng [1 ]
Long, Guodong [1 ]
Affiliations
[1] Univ Technol Sydney, Australian AI Inst, Sch Comp Sci, FEIT, Sydney, NSW, Australia
Source
17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023 | 2023
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing multi-style image captioning methods show promising results in generating captions with accurate visual content and the desired linguistic style. However, these methods overlook the relationship between linguistic style and visual content. To overcome this drawback, we propose style-aware contrastive learning for multi-style image captioning. First, we present a style-aware visual encoder with contrastive learning to mine potential visual content relevant to style. Moreover, we propose a style-aware triplet contrast objective to distinguish whether the image, style, and caption match. To provide positive and negative samples for contrastive learning, we present three retrieval schemes: object-based retrieval, RoI-based retrieval, and triplet-based retrieval, and design a dynamic trade-off function to calculate retrieval scores. Experimental results demonstrate that our approach achieves state-of-the-art performance. In addition, we conduct an extensive analysis to verify the effectiveness of our method.
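The abstract gives no formula for the triplet contrast objective, so the following is only a rough illustration of the general idea: score an (image, style, caption) triplet, then penalize the model when a matched triplet does not outscore mismatched ones. Everything in it is an assumption made for illustration and is not taken from the paper: the additive fusion of image and style vectors, the cosine score, the InfoNCE-style loss, the temperature value, and the names `triplet_score` and `contrastive_loss`.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def triplet_score(image, style, caption):
    """Hypothetical matching score for an (image, style, caption) triplet.

    Assumption: image and style embeddings are fused by simple addition and
    compared to the caption embedding with cosine similarity. The fused
    vector must be non-zero.
    """
    fused = normalize([i + s for i, s in zip(image, style)])
    return dot(fused, normalize(caption))


def contrastive_loss(positive, negatives, temperature=0.1):
    """InfoNCE-style loss: negative log-softmax of the positive triplet's
    score among the positive and all negative triplets."""
    scores = [triplet_score(*positive)]
    scores += [triplet_score(*neg) for neg in negatives]
    logits = [s / temperature for s in scores]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


# Toy 2-d embeddings, chosen by hand for illustration only.
image, style = [1.0, 0.0], [0.0, 1.0]
matched = (image, style, [1.0, 1.0])
wrong_caption = (image, style, [1.0, -1.0])
wrong_style = (image, [0.0, -1.0], [1.0, 1.0])

print(contrastive_loss(matched, [wrong_caption, wrong_style]))
```

In this toy setup the matched triplet scores far above both mismatched ones, so the loss is close to zero; swapping a mismatched triplet into the positive slot makes it large. In the paper, the positive and negative samples come from the three retrieval schemes named in the abstract, which this sketch does not attempt to model.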
Pages: 2257-2267
Number of pages: 11