Visual Relationship Embedding Network for Image Paragraph Generation
Cited by: 14
Authors:
Che, Wenbin [1,2]
Fan, Xiaopeng [1,2]
Xiong, Ruiqin [3]
Zhao, Debin [1,2]
Affiliations:
[1] Harbin Inst Technol, Res Ctr Intelligent Interface & Human Comp Intera, Dept Comp Sci & Technol, Harbin 150001, Peoples R China
[2] PengCheng Lab, Shenzhen 518055, Peoples R China
[3] Peking Univ, Inst Digital Media, Sch Elect Engn & Comp Sci, Beijing 100871, Peoples R China
Funding:
US National Science Foundation;
Keywords:
Visualization;
Semantics;
Task analysis;
Proposals;
Automobiles;
Buildings;
Gallium nitride;
Paragraph generation;
image caption;
region localization;
attention network;
visual relationship;
GAN;
LSTM;
LANGUAGE;
DOI:
10.1109/TMM.2019.2954750
Chinese Library Classification (CLC) number:
TP [Automation Technology, Computer Technology];
Discipline classification code:
0812;
Abstract:
Image paragraph generation aims to produce a complete description of a given image. This task is more challenging than image captioning, which only generates one sentence to describe the entire image. Traditional paragraph generation methods usually produce paragraph descriptions based on individual regions that are detected by a Region Proposal Network (RPN). However, relationships among visual objects are either ignored or utilized in an implicit manner in previous work. In this paper, we attempt to explore more visual information through a novel paragraph generation network that explicitly incorporates visual relationship semantics when producing descriptions. First, a novel Relation Pair Generative Adversarial Network (RP-GAN) is designed to locate regions that may cover subjective or objective elements. Then, their relationships are inferred through an attention-based network. Finally, the visual features and relationship semantics of valid relation pairs are taken as inputs by a Long Short-Term Memory (LSTM) network for generating sentences. The experimental results show that by explicitly utilizing the predicted relationship information, our proposed method obtains more accurate and informative paragraph descriptions than previous methods.
Pages: 2307-2320
Page count: 14