Automated Image Captioning with Multi-layer Gated Recurrent Unit

Cited by: 0
Authors
Moral, Ozge Taylan [1]
Kilic, Volkan [1]
Onan, Aytug [2]
Wang, Wenwu [3]
Affiliations
[1] Izmir Katip Celebi Univ, Elect & Elect Engn Grad Program, Izmir, Turkey
[2] Izmir Katip Celebi Univ, Dept Comp Engn, Izmir, Turkey
[3] Univ Surrey, Ctr Vis Speech & Signal Proc CVSSP, Guildford, Surrey, England
Source
2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022) | 2022
Keywords
convolutional neural network; gated recurrent unit; image captioning; recurrent neural network
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Describing the semantic content of an image in natural language, known as image captioning, has recently attracted substantial interest in the computer vision and language processing communities. Current image captioning approaches are mainly based on an encoder-decoder framework in which visual information is extracted by an image encoder and captions are generated by a text decoder, using convolutional neural networks (CNNs) and recurrent neural networks (RNNs), respectively. Although this framework is promising for image captioning, it has limitations in using the encoded visual information to generate grammatically and semantically correct captions in the RNN decoder. More specifically, the RNN decoder is ineffective at using the contextual information from the encoded data because of its limited ability to capture complex long-term dependencies. Inspired by the advantages of the gated recurrent unit (GRU), in this paper we propose an extension of the conventional RNN that introduces a multi-layer GRU, which modulates the most relevant information inside the unit to enhance the semantic coherence of captions. Experimental results on the MSCOCO dataset show the superiority of our proposed approach over state-of-the-art approaches in several performance metrics.
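The multi-layer GRU decoder mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses one common GRU formulation (update gate, reset gate, candidate state) in pure Python, omits bias terms, and uses random untrained weights. The class names `GRUCell` and `MultiLayerGRU`, the layer sizes, and the idea of feeding an encoded image feature vector as the first input step are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng):
    # Small random weights; a trained model would learn these instead.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Hypothetical single GRU layer (biases omitted for brevity)."""

    def __init__(self, input_size, hidden_size, rng):
        # One input and one recurrent matrix per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: init_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: init_matrix(hidden_size, hidden_size, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        # The reset gate scales how much of the previous state feeds the candidate.
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b)
             for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # The update gate interpolates between the candidate and the old state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


class MultiLayerGRU:
    """Hypothetical stack of GRU layers; each layer feeds the next."""

    def __init__(self, input_size, hidden_size, num_layers, seed=0):
        rng = random.Random(seed)
        sizes = [input_size] + [hidden_size] * (num_layers - 1)
        self.cells = [GRUCell(s, hidden_size, rng) for s in sizes]
        self.hidden_size = hidden_size

    def forward(self, inputs):
        # One hidden state per layer, initialised to zeros.
        hs = [[0.0] * self.hidden_size for _ in self.cells]
        outputs = []
        for x in inputs:
            layer_in = x
            for i, cell in enumerate(self.cells):
                hs[i] = cell.step(layer_in, hs[i])
                layer_in = hs[i]
            outputs.append(layer_in)  # top-layer state at this time step
        return outputs


if __name__ == "__main__":
    # Assumed toy inputs: an "image feature" vector followed by one "word embedding".
    image_feature = [0.1, 0.2, 0.3, 0.4]
    word_embedding = [0.0, -0.1, 0.2, 0.5]
    decoder = MultiLayerGRU(input_size=4, hidden_size=3, num_layers=2)
    for state in decoder.forward([image_feature, word_embedding]):
        print(state)
```

In a full captioning model, each top-layer state would typically be projected onto the vocabulary to predict the next word; that step, and the CNN encoder, are left out of this sketch.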
Pages: 1160-1164
Page count: 5