Radial Graph Convolutional Network for Visual Question Generation

Cited by: 42

Authors:

Xu, Xing [1,2]

Wang, Tan [1,2]

Yang, Yang [1,2]

Hanjalic, Alan [3]

Shen, Heng Tao [1,2]

Affiliations:

[1] Univ Elect Sci & Technol China, Ctr Future Multimedia, Chengdu 610051, Peoples R China

[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610051, Peoples R China

[3] Delft Univ Technol, Sch Informat & Software Engn, NL-2628 CD Delft, Netherlands

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2021 / Vol. 32 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

Task analysis; Visualization; Training; Data models; Semantics; Convolution; Cross-media understanding; graph convolutional network (GCN); visual question generation (VQG);

DOI:

10.1109/TNNLS.2020.2986029

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this article, we address the problem of visual question generation (VQG), a challenge in which a computer is required to generate meaningful questions about an image targeting a given answer. The existing approaches typically treat the VQG task as a reversed visual question answer (VQA) task, requiring the exhaustive match among all the image regions and the given answer. To reduce the complexity, we propose an innovative answer-centric approach termed radial graph convolutional network (Radial-GCN) to focus on the relevant image regions only. Our Radial-GCN method can quickly find the core answer area in an image by matching the latent answer with the semantic labels learned from all image regions. Then, a novel sparse graph of the radial structure is naturally built to capture the associations between the core node (i.e., answer area) and peripheral nodes (i.e., other areas); the graphic attention is subsequently adopted to steer the convolutional propagation toward potentially more relevant nodes for final question generation. Extensive experiments on three benchmark data sets show the superiority of our approach compared with the reference methods. Even in the unexplored challenging zero-shot VQA task, the synthesized questions by our method remarkably boost the performance of several state-of-the-art VQA methods from 0% to over 40%. The implementation code of our proposed method and the successfully generated questions are available at https://github.com/Wangt-CN/VQG-GCN.
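The answer-centric idea in the abstract (pick a core region by matching the answer against region labels, connect it radially to the other regions, then weight propagation by attention) can be illustrated with a small sketch. This is not the authors' implementation, which is in the repository linked above; every function name, the cosine matching, and the dot-product attention are assumptions chosen only to make the structure concrete.

```python
import math

# Illustrative sketch only: names and scoring choices are hypothetical,
# not taken from the Radial-GCN code.

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def find_core(answer_vec, region_label_vecs):
    """Index of the region whose label embedding best matches the answer."""
    scores = [cosine(answer_vec, r) for r in region_label_vecs]
    return max(range(len(scores)), key=scores.__getitem__)

def radial_edges(num_regions, core):
    """Sparse star-shaped graph: the core node linked to every other node."""
    return [(core, i) for i in range(num_regions) if i != core]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend_to_core(features, core):
    """One attention-weighted aggregation step from peripheral nodes to the core.

    Assumes dot-product attention and a residual sum; returns the updated
    core feature and the attention weights over the peripheral nodes.
    """
    edges = radial_edges(len(features), core)
    if not edges:  # a single region has no neighbours to aggregate
        return list(features[core]), []
    logits = [sum(a * b for a, b in zip(features[core], features[j]))
              for _, j in edges]
    weights = softmax(logits)
    out = list(features[core])
    for w, (_, j) in zip(weights, edges):
        for d in range(len(out)):
            out[d] += w * features[j][d]
    return out, weights

regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy region embeddings
answer = [0.0, 1.0]                              # toy answer embedding
core = find_core(answer, regions)
edges = radial_edges(len(regions), core)
updated_core, weights = attend_to_core(regions, core)
```

The star graph has only n - 1 edges for n regions, which reflects the sparsity the abstract contrasts with exhaustive matching between all image regions and the answer.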

Citation

Pages: 1654-1667

Page count: 14

50 records in total

[21] ADGCN: An Asynchronous Dilation Graph Convolutional Network for Traffic Flow Prediction
Qi, Tao
Li, Guanghui
Chen, Lingqiang
Xue, Yanming
IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (05) : 4001 - 4014
[22] Latent Attention Network With Position Perception for Visual Question Answering
Zhang, Jing
Liu, Xiaoqiang
Wang, Zhe
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, : 1 - 11
[23] Topological Graph Convolutional Network Based on Complex Network Characteristics
Gao, He
Yu, Xiang
Sui, Yi
Shao, Fengjing
Sun, Rencheng
IEEE ACCESS, 2022, 10 : 64465 - 64472
[24] A Convolutional Neural Network and Graph Convolutional Network Based Framework for Classification of Breast Histopathological Images
Gao, Zhiyang
Lu, Zhiyang
Wang, Jun
Ying, Shihui
Shi, Jun
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (07) : 3163 - 3173
[25] Knowledge-Based Visual Question Generation
Xie, Jiayuan
Fang, Wenhao
Cai, Yi
Huang, Qingbao
Li, Qing
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (11) : 7547 - 7558
[26] Multitask Learning for Visual Question Answering
Ma, Jie
Liu, Jun
Lin, Qika
Wu, Bei
Wang, Yaxian
You, Yang
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (03) : 1380 - 1394
[27] Graph-Based Multi-Interaction Network for Video Question Answering
Gu, Mao
Zhao, Zhou
Jin, Weike
Hong, Richang
Wu, Fei
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 2758 - 2770
[28] Joint Graph Attention and Asymmetric Convolutional Neural Network for Deep Image Compression
Tang, Zhisen
Wang, Hanli
Yi, Xiaokai
Zhang, Yun
Kwong, Sam
Kuo, C. -C. Jay
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (01) : 421 - 433
[29] Recursive Multi-Relational Graph Convolutional Network for Automatic Photo Selection
Xu, Wujiang
Xu, Yifei
Sang, Genan
Li, Li
Wang, Aichen
Wei, Pingping
Zhu, Li
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 3825 - 3840
[30] Adaptive Semantic-Spatio-Temporal Graph Convolutional Network for Lip Reading
Sheng, Changchong
Zhu, Xinzhong
Xu, Huiying
Pietikainen, Matti
Liu, Li
IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 24 : 3545 - 3557
