Visual Dialog with Multi-turn Attentional Memory Network

Cited by: 2

Authors:

Kong, Dejiang [1]

Wu, Fei [1]

Affiliations:

[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China

Source:

ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT I | 2018 / Vol. 11164

Funding:

National Natural Science Foundation of China;

Keywords:

Visual dialog; Memory network; Multi-turn attention;

DOI:

10.1007/978-3-030-00776-8_56

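Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract: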

Visual dialog is the task of answering a question given an input image and a historical dialog about the image; it often requires retrieving visual and textual facts relevant to the question. The problem differs from visual question answering (VQA), which relies only on visual grounding estimated from an image-question pair, whereas visual dialog requires interactions among a question, an input image and a historical dialog. Most methods rely on a one-turn attention network to obtain facts w.r.t. a question. However, the information transition phenomenon that exists in these facts prevents these methods from retrieving all relevant information. In this paper, we propose a multi-turn attentional memory network for visual dialog. First, we propose an attentional memory network that maintains image regions and the historical dialog in two memory banks and attends the question to be answered to both the visual and textual banks to obtain multi-modal facts. Further, considering the information transition phenomenon, we design a multi-turn attention architecture that attends to the memory banks over multiple turns to retrieve more facts and thereby produce a better answer. We evaluate the proposed model on the VisDial v0.9 dataset, and the experimental results demonstrate its effectiveness.
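
The multi-turn attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the additive query update, and all names (`attend`, `multi_turn_attention`, `visual_bank`, `textual_bank`, `turns`) are assumptions made for illustration only. It shows a question vector attending to a visual and a textual memory bank, with the retrieved facts folded back into the query so that a later turn can retrieve different facts.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Dot-product attention (an assumed scoring function).

    Returns the attention-weighted sum of the memory slots.
    """
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    return [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(len(query))
    ]


def multi_turn_attention(question, visual_bank, textual_bank, turns=2):
    """Attend to both memory banks for several turns.

    Assumption: after each turn the retrieved visual and textual facts
    are simply added to the query; the paper may use a different update.
    """
    query = list(question)
    for _ in range(turns):
        visual_fact = attend(query, visual_bank)
        textual_fact = attend(query, textual_bank)
        query = [q + v + t for q, v, t in zip(query, visual_fact, textual_fact)]
    return query


if __name__ == "__main__":
    # Toy 2-dimensional features; real models would use learned embeddings.
    question = [1.0, 0.0]
    visual_bank = [[0.0, 1.0], [1.0, 0.0]]   # hypothetical image-region features
    textual_bank = [[1.0, 1.0], [0.5, 0.0]]  # hypothetical dialog-history features
    print(multi_turn_attention(question, visual_bank, textual_bank, turns=2))
```

With `turns=1` the sketch reduces to a one-turn attention baseline, which makes the contrast drawn in the abstract easy to experiment with.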

Citation

Pages: 611 - 621

Number of pages: 11

