Automated Image Captioning with Multi-layer Gated Recurrent Unit

Cited by: 0
Authors
Moral, Ozge Taylan [1]
Kilic, Volkan [1]
Onan, Aytug [2]
Wang, Wenwu [3]
Affiliations
[1] Izmir Katip Celebi Univ, Elect & Elect Engn Grad Program, Izmir, Turkey
[2] Izmir Katip Celebi Univ, Dept Comp Engn, Izmir, Turkey
[3] Univ Surrey, Ctr Vis Speech & Signal Proc CVSSP, Guildford, Surrey, England
Source
2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022) | 2022
Keywords
convolutional neural network; gated recurrent unit; image captioning; recurrent neural network
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Describing the semantic content of an image in natural language, known as image captioning, has recently attracted substantial interest in the computer vision and language processing communities. Current image captioning approaches are mainly based on an encoder-decoder framework in which visual information is extracted by an image encoder and captions are generated by a text decoder, using convolutional neural networks (CNNs) and recurrent neural networks (RNNs), respectively. Although this framework is promising for image captioning, the RNN decoder is limited in how well it can use the encoded visual information to generate grammatically and semantically correct captions. More specifically, the RNN decoder uses the contextual information in the encoded data ineffectively because of its limited ability to capture long-term complex dependencies. Inspired by the advantages of the gated recurrent unit (GRU), in this paper we propose an extension of the conventional RNN that introduces a multi-layer GRU, which modulates the most relevant information inside the unit to enhance the semantic coherence of captions. Experimental results on the MSCOCO dataset show the superiority of our proposed approach over state-of-the-art approaches on several performance metrics.
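The decoder idea in the abstract can be illustrated with a minimal sketch of a stacked (multi-layer) GRU step. This is not the authors' implementation: the gate equations follow the standard GRU formulation, and the function names (`init_layer`, `gru_cell`, `multilayer_gru_step`), the dimensions, the random initialisation, and the random vector standing in for CNN image features are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_layer(in_dim, hid_dim, rng):
    """Hypothetical random initialisation of one GRU layer.

    Each gate ("z" update, "r" reset, "n" candidate) gets an input
    matrix W, a recurrent matrix U, and a bias vector b.
    """
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {g: (mat(hid_dim, in_dim), mat(hid_dim, hid_dim), [0.0] * hid_dim)
            for g in ("z", "r", "n")}


def gru_cell(params, x, h):
    """One standard GRU step: returns the new hidden state."""
    def gate(name, hidden, act):
        W, U, b = params[name]
        return [act(a + c + d)
                for a, c, d in zip(matvec(W, x), matvec(U, hidden), b)]

    z = gate("z", h, sigmoid)                      # update gate
    r = gate("r", h, sigmoid)                      # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h)]    # reset applied to the old state
    n = gate("n", reset_h, math.tanh)              # candidate state
    # Blend the candidate and the old state, weighted by the update gate.
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def multilayer_gru_step(layers, x, hiddens):
    """Run one time step through stacked GRU layers.

    The output of each layer is the input of the next layer.
    """
    new_hiddens = []
    inp = x
    for params, h in zip(layers, hiddens):
        h_new = gru_cell(params, inp, h)
        new_hiddens.append(h_new)
        inp = h_new
    return new_hiddens


rng = random.Random(0)
feat_dim, hid_dim, num_layers = 8, 4, 2            # assumed toy sizes
layers = [init_layer(feat_dim if i == 0 else hid_dim, hid_dim, rng)
          for i in range(num_layers)]
hiddens = [[0.0] * hid_dim for _ in range(num_layers)]

# Stand-in for the CNN encoder output; a real system would use image features.
image_features = [rng.uniform(-1, 1) for _ in range(feat_dim)]

for _ in range(3):                                 # three decoding steps
    hiddens = multilayer_gru_step(layers, image_features, hiddens)
```

In a full captioning model, the top layer's hidden state would typically be projected onto the vocabulary to predict the next word, and the input at each step would also include the embedding of the previously generated word; both are omitted here to keep the sketch short.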
Pages: 1160-1164
Page count: 5