Automated Generation of Chinese Text-Image Summaries Using Deep Learning Techniques

Citations: 0
Authors
Xu, Meiling [1, 2]
Abd Rahman, Hayati [1]
Li, Feng [1, 2]
Affiliations
[1] Univ Teknol MARA, Coll Comp Informat & Math, Shah Alam 40450, Malaysia
[2] Hebei Finance Univ, Coll Comp & Informat Engn, Baoding 071051, Peoples R China
Keywords
Chinese text-image summaries; automated summary generation; deep learning; MaliGAN; cross-modal similarity retrieval; adaptive fusion strategy
DOI
10.18280/ts.400644
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the internet era, Chinese text-image content is produced continuously and in abundance, which calls for effective automated technologies to process and summarize it. Automated generation of Chinese text-image summaries helps readers grasp key information quickly and so makes information consumption more efficient. Because of the unique characteristics of the Chinese language, traditional automatic summarization techniques transfer poorly, which motivates text-image summary generation technologies tailored to Chinese. Existing natural language processing and deep learning techniques have advanced text summarization, but they remain deficient in deep semantic mining and in integrating text and image content. This study focuses on two approaches. The first is a generative approach based on an enhanced MaliGAN model, which uses deep learning models to improve text generation quality. The second is a retrieval-based approach, which uses cross-modal similarity retrieval to extract the text most relevant to the image content and uses it to guide summary generation. The study also proposes a model architecture comprising segmentation, cross-modal retrieval, and adaptive fusion strategy modules, significantly improving the accuracy and reliability of text-image summary generation.
Pages: 2835-2843
Page count: 9