Collaborative Learning Method for Natural Image Captioning

Authors
Wang, Rongzhao [1]
Liu, Libo [1]
Affiliations
[1] Ningxia Univ, Sch Informat Engn, Yinchuan, Peoples R China
Source
DATA SCIENCE (ICPCSEE 2022), PT I | 2022 / Vol. 1628
Keywords
Image captioning; Pix2pix inverting; Collaborative learning;
DOI
10.1007/978-981-19-5194-7_19
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
We propose a collaborative learning method for natural image captioning. Many existing methods use CNNs pretrained for image classification to obtain feature representations for caption generation, which ignores the gap in image feature representations between different computer vision tasks. To address this problem, our method exploits the similarity between the image captioning and pix2pix inverting tasks to narrow the feature representation gap. Specifically, our framework consists of two modules: 1) the pix2pix module (P2PM), which has a shared feature extractor that extracts feature representations and a U-net architecture that encodes the image into a latent code and then decodes it back to the original image; 2) the natural language generation module (NLGM), which generates descriptions from the feature representations extracted by P2PM. Consequently, both the feature representations and the generated image captions improve during the collaborative learning process. Experimental results on the MSCOCO 2017 dataset demonstrate the effectiveness of our approach compared with the baseline methods.
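The abstract does not state the training objective, so the following is only a minimal sketch of how a shared feature extractor could be trained by both modules at once. It assumes (this is not confirmed by the record) that the P2PM contributes an L1 reconstruction loss, the NLGM contributes a token-level cross-entropy loss, and the two are summed with a hypothetical weight `recon_weight`. All function and parameter names are illustrative.

```python
import math


def l1_reconstruction_loss(original, reconstructed):
    """Mean absolute pixel error between an image and its reconstruction.

    Assumed stand-in for the P2PM objective; images are flat lists of floats.
    """
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(o - r) for o, r in zip(original, reconstructed)) / len(original)


def caption_cross_entropy(token_probs):
    """Mean negative log-likelihood of the reference caption tokens.

    Assumed stand-in for the NLGM objective; token_probs holds the
    probability the model assigned to each reference token.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def collaborative_loss(original, reconstructed, token_probs, recon_weight=1.0):
    """Weighted sum of both losses (hypothetical combination rule).

    Because both terms depend on the shared features, minimizing the sum
    would update the shared extractor from both tasks.
    """
    caption_term = caption_cross_entropy(token_probs)
    recon_term = l1_reconstruction_loss(original, reconstructed)
    return caption_term + recon_weight * recon_term
```

In a real system the two terms would be computed by neural networks and minimized by gradient descent; the sketch only shows how the two task losses might be combined into one scalar.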
Pages: 249-261
Page count: 13