Visual Dialog with Multi-turn Attentional Memory Network

Cited by: 2

Authors:

Kong, Dejiang [1]

Wu, Fei [1]

Affiliations:

[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China

Source:

ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT I | 2018 / Vol. 11164

Funding:

National Natural Science Foundation of China;

Keywords:

Visual dialog; Memory network; Multi-turn attention;

DOI:

10.1007/978-3-030-00776-8_56

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Visual dialog is the task of answering a question given an input image and a historical dialog about that image, and it often requires retrieving visual and textual facts relevant to the question. The problem differs from visual question answering (VQA), which relies only on visual grounding estimated from an image and question pair, whereas visual dialog requires interactions among a question, an input image and a historical dialog. Most methods rely on a one-turn attention network to obtain facts w.r.t. a question. However, the information transition phenomenon that exists in these facts prevents these methods from retrieving all relevant information. In this paper, we propose a multi-turn attentional memory network for visual dialog. First, we propose an attentional memory network that maintains image regions and the historical dialog in two memory banks and attends the question to be answered to both the visual and textual banks to obtain multi-modal facts. Further, considering the information transition phenomenon, we design a multi-turn attention architecture that attends to the memory banks over multiple turns to retrieve more facts and so produce a better answer. We evaluate the proposed model on the VisDial v0.9 dataset, and the experimental results demonstrate the effectiveness of the proposed model.
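
The multi-turn read over two memory banks described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring function or the query update, so the dot-product scoring, the additive query update, and all names (`attend`, `multi_turn_attention`, `visual_bank`, `textual_bank`, `turns`) are assumptions chosen only to show the general idea.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """One attention read: score each memory slot against the query
    (dot product, an assumption), then return the weighted sum of slots."""
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * slot[i] for w, slot in zip(weights, memory))
            for i in range(dim)]


def multi_turn_attention(question, visual_bank, textual_bank, turns=2):
    """Read both banks repeatedly, refining the query after every turn.

    The additive update below is a hypothetical choice; the abstract only
    says that the banks are attended to over multiple turns.
    """
    query = list(question)
    for _ in range(turns):
        visual_fact = attend(query, visual_bank)    # read image-region memory
        textual_fact = attend(query, textual_bank)  # read dialog-history memory
        query = [q + v + t
                 for q, v, t in zip(query, visual_fact, textual_fact)]
    return query


# Toy usage with 2-dimensional embeddings (values are arbitrary).
question = [1.0, 0.0]
visual_bank = [[1.0, 0.0], [0.0, 1.0]]
textual_bank = [[0.5, 0.5], [0.0, 1.0]]
fused = multi_turn_attention(question, visual_bank, textual_bank, turns=2)
```

With `turns=1` the loop reduces to a single read of each bank, which corresponds to the one-turn attention setting the abstract contrasts against; later turns query the banks with a representation that already contains the facts retrieved earlier.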


Pages: 611-621

Page count: 11
