IMAGE CAPTIONING WITH ATTRIBUTE REFINEMENT

Cited by: 0

Authors:

Huang, Yiqing [1]

Li, Cong [1]

Li, Tianpeng [1]

Wan, Weitao [1]

Chen, Jiansheng [1]

Affiliations:

[1] Tsinghua Univ, Beijing 100084, Peoples R China

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2019

Funding:

National Natural Science Foundation of China;

Keywords:

Image captioning; attribute recognition; Semantic attention; Deep Neural Network; Conditional Random Field;

DOI:

10.1109/icip.2019.8803108

Chinese Library Classification (CLC) number:

TB8 [Photographic technology];

Discipline classification code:

0804;

Abstract:

Semantic attention has long been used in image captioning models to improve captioning performance. Models pre-trained for attribute recognition are used to generate image attributes for captioning, but these models are generally not trained jointly with the captioning model. In this paper, we propose an attribute refinement network, which integrates attribute recognition with image captioning to boost performance on both tasks. We model the correlation between attributes using semantic information from image captioning to improve recognition accuracy. In turn, better attribute recognition results effectively enhance image captioning performance. Our model achieves CIDEr-D and SPICE scores of 115.1 and 20.9, respectively, on the MS COCO test set, improving on all compared methods.

Citation

Pages: 1820-1824

Number of pages: 5
