Radial Graph Convolutional Network for Visual Question Generation

Cited by: 42
Authors
Xu, Xing [1, 2]
Wang, Tan [1, 2]
Yang, Yang [1, 2]
Hanjalic, Alan [3]
Shen, Heng Tao [1, 2]
Affiliations
[1] Univ Elect Sci & Technol China, Ctr Future Multimedia, Chengdu 610051, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610051, Peoples R China
[3] Delft Univ Technol, Sch Informat & Software Engn, NL-2628 CD Delft, Netherlands
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Visualization; Training; Data models; Semantics; Convolution; Cross-media understanding; graph convolutional network (GCN); visual question generation (VQG);
DOI
10.1109/TNNLS.2020.2986029
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, we address the problem of visual question generation (VQG), a challenge in which a computer is required to generate meaningful questions about an image targeting a given answer. Existing approaches typically treat the VQG task as a reversed visual question answering (VQA) task, requiring an exhaustive match among all the image regions and the given answer. To reduce the complexity, we propose an innovative answer-centric approach termed radial graph convolutional network (Radial-GCN) to focus on the relevant image regions only. Our Radial-GCN method can quickly find the core answer area in an image by matching the latent answer with the semantic labels learned from all image regions. Then, a novel sparse graph of the radial structure is naturally built to capture the associations between the core node (i.e., answer area) and peripheral nodes (i.e., other areas); the graphic attention is subsequently adopted to steer the convolutional propagation toward potentially more relevant nodes for final question generation. Extensive experiments on three benchmark data sets show the superiority of our approach compared with the reference methods. Even in the unexplored challenging zero-shot VQA task, the questions synthesized by our method remarkably boost the performance of several state-of-the-art VQA methods from 0% to over 40%. The implementation code of our proposed method and the successfully generated questions are available at https://github.com/Wangt-CN/VQG-GCN.
Pages: 1654 - 1667
Page count: 14
Related papers
50 in total
  • [41] A Survey on Graph Convolutional Neural Network
    Xu B.-B.
    Cen K.-T.
    Huang J.-J.
    Shen H.-W.
    Cheng X.-Q.
    Jisuanji Xuebao/Chinese Journal of Computers, 2020, 43 (05) : 755 - 780
  • [42] Fuzzy Graph Subspace Convolutional Network
    Zhou, Jianhang
    Zhang, Qi
    Zeng, Shaoning
    Zhang, Bob
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 5641 - 5655
  • [43] Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos
    Zhang, Zongmeng
    Han, Xianjing
    Song, Xuemeng
    Yan, Yan
    Nie, Liqiang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 8265 - 8277
  • [44] Spatial Pooling Graph Convolutional Network for Hyperspectral Image Classification
    Zhang, Xiangrong
    Chen, Shutong
    Zhu, Peng
    Tang, Xu
    Feng, Jie
    Jiao, Licheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [45] Multiscale Dynamic Graph Convolutional Network for Hyperspectral Image Classification
    Wan, Sheng
    Gong, Chen
    Zhong, Ping
    Du, Bo
    Zhang, Lefei
    Yang, Jian
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2020, 58 (05) : 3162 - 3177
  • [46] Adversarial Graph Convolutional Network for Cross-Modal Retrieval
    Dong, Xinfeng
    Liu, Li
    Zhu, Lei
    Nie, Liqiang
    Zhang, Huaxiang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (03) : 1634 - 1645
  • [47] A Graph Convolutional Network With Multiple Dependency Representations for Relation Extraction
    Hu, Yanfeng
    Shen, Hong
    Liu, Wuling
    Min, Fei
    Qiao, Xue
    Jin, Kangrong
    IEEE ACCESS, 2021, 9 : 81575 - 81587
  • [48] Anisotropic Graph Convolutional Network for Semi-Supervised Learning
    Mesgaran, Mahsa
    Ben Hamza, A.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 3931 - 3942
  • [49] Traffic Message Channel Prediction Based on Graph Convolutional Network
    Li, Ning
    Jia, Shuangcheng
    Li, Qian
    IEEE ACCESS, 2021, 9 : 135423 - 135431
  • [50] Residual convolutional graph neural network with subgraph attention pooling
    Duan, Yutai
    Wang, Jianming
    Ma, Haoran
    Sun, Yukuan
    TSINGHUA SCIENCE AND TECHNOLOGY, 2022, 27 (04) : 653 - 663