Automatic image captioning combining natural language processing and deep neural networks

Cited by: 11
Authors
Rinaldi, Antonio M. [1]
Russo, Cristiano [1]
Tommasino, Cristian [1]
Affiliations
[1] Univ Naples Federico II, Dept Elect Engn & Informat Technol, IKNOS LAB Intelligent & Knowledge Syst LUPT, Via Claudio 21, I-80125 Naples, Italy
Keywords
Object detection; Image captioning; Deep neural networks; Semantic-instance segmentation;
DOI
10.1016/j.rineng.2023.101107
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
An image contains a lot of information that humans can detect in a very short time. Image captioning aims to detect this information by describing the image content through image and text processing techniques. A distinctive feature of the proposed approach is the combination of multiple networks to capture as many semantically distinct features as possible. In this work, our goal is to show that a strategy combining existing methods can efficiently improve performance in object detection tasks compared with the performance achieved by each method tested individually. The approach uses different deep neural networks that perform two levels of hierarchical object detection in an image. The results are combined and used by a captioning module that generates image captions through natural language processing techniques. Several experimental results are reported and discussed to show the effectiveness of our framework. The combination strategy also shows a gain in precision over single models.
Pages: 14
Related papers
50 in total
  • [21] Image disambiguation with deep neural networks
    DeGuchy, Omar
    Ho, Alex
    Marcia, Roummel F.
    APPLICATIONS OF MACHINE LEARNING, 2019, 11139
  • [22] Exploiting deep representations for natural language processing
    Zi-Yi Dou
    Xing Wang
    Shuming Shi
    Zhaopeng Tu
    NEUROCOMPUTING, 2020, 386 (386) : 1 - 7
  • [23] Automatic Question Tagging with Deep Neural Networks
    Sun, Bo
    Zhu, Yunzong
    Xiao, Yongkang
    Xiao, Rong
    Wei, Yungang
    IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, 2019, 12 (01) : 29 - 43
  • [24] Automatic phoneme recognition by deep neural networks
    Pereira, Bianca Valeria L.
    de Carvalho, Mateus B. F.
    Alves, Pedro Augusto A. da S. de A. Nava
    Ribeiro, Paulo Rogerio de A.
    de Oliveira, Alexandre Cesar M.
    de Almeida Neto, Areolino
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (11) : 16654 - 16678
  • [25] Recurrent Neural Networks with Mixed Hierarchical Structures and EM Algorithm for Natural Language Processing
    Luo, Zhaoxin
    Zhu, Michael
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6104 - 6113
  • [26] Deep neural networks for automatic speech processing: a survey from large corpora to limited data
    Vincent Roger
    Jérôme Farinas
    Julien Pinquier
    EURASIP Journal on Audio, Speech, and Music Processing, 2022
  • [27] Ensemble Learning on Deep Neural Networks for Image Caption Generation
    Katpally, Harshitha
    Bansal, Ajay
    2020 IEEE 14TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2020), 2020, : 61 - 68
  • [28] Deep neural networks for automatic speech processing: a survey from large corpora to limited data
    Roger, Vincent
    Farinas, Jerome
    Pinquier, Julien
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2022, 2022 (01)
  • [29] Imbalanced prediction of emergency department admission using natural language processing and deep neural network
    Chen, Tzu-Li
    Chen, James C.
    Chang, Wen-Han
    Tsai, Weide
    Shih, Mei-Chuan
    Nabila, Achmad Wildan
    JOURNAL OF BIOMEDICAL INFORMATICS, 2022, 133
  • [30] Auroral Image Classification With Deep Neural Networks
    Kvammen, Andreas
    Wickstrom, Kristoffer
    McKay, Derek
    Partamies, Noora
    JOURNAL OF GEOPHYSICAL RESEARCH-SPACE PHYSICS, 2020, 125 (10)