Automated Generation of Chinese Text-Image Summaries Using Deep Learning Techniques

Cited by: 0

Authors:

Xu, Meiling [1,2]

Abd Rahman, Hayati [1]

Li, Feng [1,2]

Affiliations:

[1] Univ Teknol MARA, Coll Comp Informat & Math, Shah Alam 40450, Malaysia

[2] Hebei Finance Univ, Coll Comp & Informat Engn, Baoding 071051, Peoples R China

Source:

TRAITEMENT DU SIGNAL | 2023, Vol. 40, No. 06

Keywords:

Chinese text-image summaries; automated summary generation; deep learning; MaliGAN; cross-modal similarity retrieval; adaptive fusion strategy;

DOI:

10.18280/ts.400644

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In the internet era, Chinese text-image content is produced continuously and in large volume, which calls for effective automated technologies to process and summarize it. Automated generation of Chinese text-image summaries lets readers grasp key information quickly and so improves the efficiency of information consumption. Because of the unique characteristics of the Chinese language, traditional automatic summarization techniques transfer poorly, which motivates text-image summary generation technologies tailored to Chinese. Existing natural language processing and deep learning techniques have advanced text summarization, but they remain deficient in deep semantic mining and in integrating text and image content. This study focuses on two approaches. The first is a generative approach based on an enhanced MaliGAN model, which uses deep learning to improve the quality of generated text. The second is a retrieval-based approach, which uses cross-modal similarity retrieval to extract the text most relevant to the image content and uses it to guide summary generation. The study also proposes a model architecture comprising segmentation, cross-modal retrieval, and adaptive fusion strategy modules, which the authors report significantly improves the accuracy and reliability of text-image summary generation.
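
The retrieval-based approach mentioned in the abstract can be illustrated with a minimal sketch. This record does not specify the paper's encoders or its similarity measure, so the sketch assumes that the image and the candidate sentences have already been embedded into a shared vector space and that cosine similarity is used for ranking. The function names `cosine` and `retrieve_top_k` and the toy vectors are hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_top_k(image_vec, sentence_vecs, k=1):
    """Return the indices of the k sentences most similar to the image, best first."""
    scored = [(cosine(image_vec, s), i) for i, s in enumerate(sentence_vecs)]
    # Sort by descending similarity; break ties by original sentence order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


# Toy embeddings, made up for illustration (not taken from the paper).
image_vec = [1.0, 0.0, 1.0]
sentence_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the image vector
    [1.0, 0.0, 0.9],  # nearly parallel to the image vector
    [1.0, 1.0, 0.0],  # partially aligned
]

top = retrieve_top_k(image_vec, sentence_vecs, k=2)
print(top)
```

In a setup like this, the retrieved sentences would then be passed to the summary generator as guidance. How the paper's adaptive fusion strategy weighs them against the generated text is not described in this record.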


Pages: 2835-2843

Page count: 9

