Visual Dialog with Multi-turn Attentional Memory Network

Cited by: 2
Authors
Kong, Dejiang [1]
Wu, Fei [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
Source
ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT I | 2018 / Vol. 11164
Funding
National Natural Science Foundation of China;
Keywords
Visual dialog; Memory network; Multi-turn attention;
DOI
10.1007/978-3-030-00776-8_56
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Visual dialog is the task of answering a question given an input image and a historical dialog about the image, and it often requires retrieving visual and textual facts relevant to the question. This problem differs from visual question answering (VQA), which relies only on visual grounding estimated from an image-question pair, whereas visual dialog requires interactions among a question, an input image and a historical dialog. Most methods rely on a one-turn attention network to obtain facts w.r.t. a question. However, the information transition phenomenon that exists in these facts prevents these methods from retrieving all relevant information. In this paper, we propose a multi-turn attentional memory network for visual dialog. First, we propose an attentional memory network that maintains image regions and the historical dialog in two memory banks and attends the question to be answered to both the visual and textual banks to obtain multi-modal facts. Further, considering the information transition phenomenon, we design a multi-turn attention architecture that attends to the memory banks over multiple turns to retrieve more facts in order to produce a better answer. We evaluate the proposed model on the VisDial v0.9 dataset, and the experimental results demonstrate the effectiveness of the proposed model.
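The multi-turn attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring function or how the query is updated between turns, so the dot-product scoring, the additive query update, and every name here (`attend`, `multi_turn_attention`, `visual_bank`, `textual_bank`, `turns`) are assumptions made only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, bank):
    """Dot-product attention (an assumed scoring function).

    Returns the attention weights over the bank slots and the
    weighted sum of the slots (the retrieved "fact").
    """
    scores = [sum(q * k for q, k in zip(query, slot)) for slot in bank]
    weights = softmax(scores)
    fact = [
        sum(w * slot[i] for w, slot in zip(weights, bank))
        for i in range(len(query))
    ]
    return weights, fact


def multi_turn_attention(question, visual_bank, textual_bank, turns=2):
    """Attend to both memory banks for several turns.

    Assumed update rule: after each turn the retrieved visual and
    textual facts are added to the query, so the next turn can
    retrieve facts related to what was found so far.
    """
    query = list(question)
    history = []
    for _ in range(turns):
        v_weights, v_fact = attend(query, visual_bank)
        t_weights, t_fact = attend(query, textual_bank)
        history.append((v_weights, t_weights))
        query = [q + v + t for q, v, t in zip(query, v_fact, t_fact)]
    return query, history


# Toy example: two image-region vectors and two dialog-history vectors.
visual_bank = [[1.0, 0.0], [0.0, 1.0]]
textual_bank = [[0.5, 0.5], [1.0, 1.0]]
final_query, attention_history = multi_turn_attention(
    [1.0, 0.0], visual_bank, textual_bank, turns=3
)
```

With `turns=1` the sketch reduces to a one-turn attention baseline of the kind the abstract contrasts against; larger values let facts retrieved in earlier turns shift where later turns look.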
Pages: 611 - 621
Number of pages: 11
Related papers
50 in total
  • [21] Deception Detection Towards Multi-turn Question Answering with Context Selector Network
    Bao, Yinan
    Ma, Qianwen
    Wei, Lingwei
    Wang, Ding
    Zhou, Wei
    Hu, Songlin
    PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT I, 2022, 13629 : 304 - 315
  • [22] HSAN: A HIERARCHICAL SELF-ATTENTION NETWORK FOR MULTI-TURN DIALOGUE GENERATION
    Kong, Yawei
    Zhang, Lu
    Ma, Can
    Cao, Cong
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7433 - 7437
  • [23] Multi-Turn Video Question Generation via Reinforced Multi-Choice Attention Network
    Guo, Zhaoyu
    Zhao, Zhou
    Jin, Weike
    Wei, Zhicheng
    Yang, Min
    Wang, Nannan
    Yuan, Nicholas Jing
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (05) : 1697 - 1710
  • [24] Multi-View Attention Network for Visual Dialog
    Park, Sungjin
    Whang, Taesun
    Yoon, Yeochan
    Lim, Heuiseok
    APPLIED SCIENCES-BASEL, 2021, 11 (07):
  • [25] Resonant multi-turn extraction: Principle and experiments
    Gilardoni, S.
    Giovannozzi, M.
    Martini, M.
    Métral, E.
    Scaramuzzi, P.
    Steerenberg, R.
    Mueller, A.-S.
    NUCLEAR INSTRUMENTS & METHODS IN PHYSICS RESEARCH SECTION A-ACCELERATORS SPECTROMETERS DETECTORS AND ASSOCIATED EQUIPMENT, 2006, 561 (02) : 249 - 256
  • [26] MULTI-TURN ADJUSTMENT TRIMMER POTENTIOMETER.
    Monma, Kenji
    1600, (24):
  • [27] Realization of a multi-turn energy recovery accelerator
    Schliessmann, Felix
    Arnold, Michaela
    Juergensen, Lars
    Pietralla, Norbert
    Dutine, Manuel
    Fischer, Marco
    Grewe, Ruben
    Steinhorst, Manuel
    Stobbe, Lennart
    Weih, Simon
    NATURE PHYSICS, 2023, 19 (04) : 597 - +
  • [28] ANALYSIS OF THE ADVANTAGES AND DISADVANTAGES OF MULTI-TURN RAILGUN
    Zhang, Jiange
    Thompson, James E.
    Lu, Zan
    Islam, Naz E.
    2012 16TH INTERNATIONAL SYMPOSIUM ON ELECTROMAGNETIC LAUNCH TECHNOLOGY (EML), 2012,
  • [29] Realization of a multi-turn energy recovery accelerator
    Felix Schliessmann
    Michaela Arnold
    Lars Juergensen
    Norbert Pietralla
    Manuel Dutine
    Marco Fischer
    Ruben Grewe
    Manuel Steinhorst
    Lennart Stobbe
    Simon Weih
    Nature Physics, 2023, 19 : 597 - 602
  • [30] Event Extraction as Multi-turn Question Answering
    Li, Fayuan
    Peng, Weihua
    Chen, Yuguang
    Wang, Quan
    Pan, Lu
    Lyu, Yajuan
    Zhu, Yong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 829 - 838