Style-Aware Contrastive Learning for Multi-Style Image Captioning

Cited by: 0
Authors
Zhou, Yucheng [1]
Long, Guodong [1]
Affiliations
[1] Univ Technol Sydney, Australian AI Inst, Sch Comp Sci, FEIT, Sydney, NSW, Australia
Source
17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023 | 2023
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Existing multi-style image captioning methods show promising results in generating captions with accurate visual content and the desired linguistic style. However, they overlook the relationship between linguistic style and visual content. To overcome this drawback, we propose style-aware contrastive learning for multi-style image captioning. First, we present a style-aware visual encoder that uses contrastive learning to mine potential visual content relevant to style. Second, we propose a style-aware triplet contrast objective to distinguish whether an image, style, and caption are matched. To provide positive and negative samples for contrastive learning, we present three retrieval schemes (object-based retrieval, RoI-based retrieval, and triplet-based retrieval) and design a dynamic trade-off function to calculate retrieval scores. Experimental results demonstrate that our approach achieves state-of-the-art performance. In addition, we conduct an extensive analysis to verify the effectiveness of our method.
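The abstract does not give the form of the triplet contrast objective, so the following is only a minimal sketch of one way such an objective could look, assuming an InfoNCE-style loss over (image, style, caption) triplets. The function names, the additive fusion of image and style embeddings, and the dot-product score are all hypothetical illustration choices and are not taken from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def triplet_score(image, style, caption):
    """Hypothetical matching score for an (image, style, caption) triplet.

    Assumption: image and style embeddings are fused by addition and
    compared with the caption embedding by a dot product.
    """
    fused = [i + s for i, s in zip(image, style)]
    return dot(fused, caption)


def triplet_contrast_loss(image, style, caption, negatives, temperature=1.0):
    """InfoNCE-style loss: matched triplet versus mismatched triplets.

    `negatives` is a list of (image, style, caption) triplets in which at
    least one element does not match the others.
    """
    pos = triplet_score(image, style, caption) / temperature
    logits = [pos] + [triplet_score(i, s, c) / temperature for i, s, c in negatives]
    # Numerically stable log-sum-exp over all candidate triplets.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - pos


# Toy usage: the negative keeps the image and caption but swaps in a wrong style.
image, style, caption = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
wrong_style = [0.0, -1.0]
loss = triplet_contrast_loss(image, style, caption, [(image, wrong_style, caption)])
```

In this toy setup the loss shrinks as the matched triplet scores higher than the mismatched ones, which is the behavior a contrastive objective of this kind is meant to encourage.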
Pages: 2257-2267
Page count: 11