Improving Image Captioning with Conditional Generative Adversarial Nets

Citations: 0

Authors:

Chen, Chen [1]

Mu, Shuai [1]

Xiao, Wanpeng [1]

Ye, Zexiong [1]

Wu, Liesi [1]

Ju, Qi [1]

Institution:

[1] Tencent AI Lab, Shenzhen 518000, People's Republic of China

Source:

THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2019

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a novel conditional-generative-adversarial-nets-based image captioning framework as an extension of the traditional reinforcement-learning (RL)-based encoder-decoder architecture. To deal with the inconsistent evaluation problem among different objective language metrics, we are motivated to design some "discriminator" networks to automatically and progressively determine whether a generated caption is human described or machine generated. Two kinds of discriminator architectures (CNN-based and RNN-based structures) are introduced, since each has its own advantages. The proposed algorithm is generic, so it can enhance any existing RL-based image captioning framework, and we show that the conventional RL training method is just a special case of our approach. Empirically, we show consistent improvements over all language evaluation metrics for different state-of-the-art image captioning models. In addition, the well-trained discriminators can also be viewed as objective image captioning evaluators.
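
The abstract says that conventional RL training is a special case of the proposed approach, but this record does not give the reward formula. The sketch below is only one way to picture that claim, under the assumption that the reward linearly blends a discriminator score with a language-metric score. The function name `mixed_reward`, the weight `lam`, and the example numbers are hypothetical and are not taken from the paper.

```python
def mixed_reward(disc_prob: float, metric_score: float, lam: float) -> float:
    """Blend a discriminator score with a language-metric score.

    Assumption (not stated in the abstract): the reward is a convex
    combination controlled by lam in [0, 1].
    disc_prob    -- discriminator's estimate that the caption is human written
    metric_score -- score from an objective language metric, scaled to [0, 1]
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * disc_prob + (1.0 - lam) * metric_score


# Illustrative numbers only.
metric_only = mixed_reward(disc_prob=0.8, metric_score=0.4, lam=0.0)
disc_only = mixed_reward(disc_prob=0.8, metric_score=0.4, lam=1.0)
blended = mixed_reward(disc_prob=0.8, metric_score=0.4, lam=0.5)
```

Under this assumption, setting `lam` to 0 leaves only the language-metric term, which corresponds to conventional metric-driven RL training; setting it to 1 uses the discriminator alone.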


Pages: 8142-8150

Page count: 9
