IMAGE CAPTIONING WITH ATTRIBUTE REFINEMENT

Cited by: 0
Authors
Huang, Yiqing [1]
Li, Cong [1]
Li, Tianpeng [1]
Wan, Weitao [1]
Chen, Jiansheng [1]
Affiliations
[1] Tsinghua Univ, Beijing 100084, Peoples R China
Source
2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2019
Funding
National Natural Science Foundation of China;
Keywords
Image captioning; attribute recognition; Semantic attention; Deep Neural Network; Conditional Random Field;
DOI
10.1109/icip.2019.8803108
Chinese Library Classification number
TB8 [Photographic technology];
Discipline classification code
0804;
Abstract
Semantic attention has long been used in image captioning models to improve captioning performance. Models pre-trained for attribute recognition are typically used to generate the image attributes, but they are generally not trained jointly with the captioning model. In this paper, we propose an attribute refinement network, which integrates attribute recognition with image captioning to boost performance on both tasks. We model the correlation between attributes using semantic information from image captioning to improve recognition accuracy. In turn, better attribute recognition results effectively enhance image captioning performance. Our model achieves CIDEr-D and SPICE scores of 115.1 and 20.9, respectively, on the MS COCO test set, improving over all compared methods.
Pages: 1820-1824
Number of pages: 5