Visual Relationship Embedding Network for Image Paragraph Generation

Cited by: 14

Authors:

Che, Wenbin ^{[1,2]}

Fan, Xiaopeng ^{[1,2]}

Xiong, Ruiqin ^{[3]}

Zhao, Debin ^{[1,2]}

Affiliations:

[1] Harbin Inst Technol, Res Ctr Intelligent Interface & Human Comp Intera, Dept Comp Sci & Technol, Harbin 150001, Peoples R China

[2] PengCheng Lab, Shenzhen 518055, Peoples R China

[3] Peking Univ, Inst Digital Media, Sch Elect Engn & Comp Sci, Beijing 100871, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2020 / Vol. 22 / No. 09

Funding:

US National Science Foundation;

Keywords:

Visualization; Semantics; Task analysis; Proposals; Automobiles; Buildings; Gallium nitride; Paragraph generation; image caption; region localization; attention network; visual relationship; GAN; LSTM; LANGUAGE;

DOI:

10.1109/TMM.2019.2954750

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Image paragraph generation aims to produce a complete description of a given image. This task is more challenging than image captioning, which only generates one sentence to describe the entire image. Traditional paragraph generation methods usually produce paragraph descriptions based on individual regions that are detected by a Region Proposal Network (RPN). However, relationships among visual objects are either ignored or utilized in an implicit manner in previous work. In this paper, we attempt to explore more visual information through a novel paragraph generation network that explicitly incorporates visual relationship semantics when producing descriptions. First, a novel Relation Pair Generative Adversarial Network (RP-GAN) is designed to locate regions that may cover subjective or objective elements. Then, their relationships are inferred through an attention-based network. Finally, the visual features and relationship semantics of valid relation pairs are taken as inputs by a Long Short-Term Memory (LSTM) network for generating sentences. The experimental results show that by explicitly utilizing the predicted relationship information, our proposed method obtains more accurate and informative paragraph descriptions than previous methods.

Pages: 2307-2320

Number of pages: 14

50 items in total

[41] Realistic Image Generation from Text by Using BERT-Based Embedding
Na, Sanghyuck
Do, Mirae
Yu, Kyeonah
Kim, Juntae
ELECTRONICS, 2022, 11 (05)
[42] EAES: Effective Augmented Embedding Spaces for Text-Based Image Captioning
Khang Nguyen
Bui, Doanh C.
Truc Trinh
Vo, Nguyen D.
IEEE ACCESS, 2022, 10 : 32443 - 32452
[43] A Hierarchical Context Embedding Network for Object Detection in Remote Sensing Images
Zhang, Ke
Wu, Yulin
Wang, Jingyu
Wang, Qi
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
[44] KE-RSIC: Remote Sensing Image Captioning Based on Knowledge Embedding
Cheng, Kangda
Cambria, Erik
Liu, Jinlong
Chen, Yushi
Wu, Zhilu
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2025, 18 : 4286 - 4304
[45] Remote Sensing Image Synthesis via Semantic Embedding Generative Adversarial Networks
Wang, Chendan
Chen, Bowen
Zou, Zhengxia
Shi, Zhenwei
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
[46] GANE: A Generative Adversarial Network Embedding
Hong, Huiting
Li, Xin
Wang, Mingzhong
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (07) : 2325 - 2335
[47] Semi-Heterogeneous Three-Way Joint Embedding Network for Sketch-Based Image Retrieval
Lei, Jianjun
Song, Yuxin
Peng, Bo
Ma, Zhanyu
Shao, Ling
Song, Yi-Zhe
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (09) : 3226 - 3237
[48] Context-Aware Visual Policy Network for Fine-Grained Image Captioning
Zha, Zheng-Jun
Liu, Daqing
Zhang, Hanwang
Zhang, Yongdong
Wu, Feng
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (02) : 710 - 722
[49] Improving paragraph-level question generation with extended answer network and uncertainty-aware beam search
Zeng, Hongwei
Zhi, Zhuo
Liu, Jun
Wei, Bifan
INFORMATION SCIENCES, 2021, 571 : 50 - 64
[50] Hierarchical Deep Embedding for Aurora Image Retrieval
Yang, Xi
Gao, Xinbo
Song, Bin
Han, Bing
IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (12) : 5773 - 5785