Visual Dialog with Multi-turn Attentional Memory Network

Cited by: 2

Authors:

Kong, Dejiang [1]

Wu, Fei [1]

Affiliations:

[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China

Source:

ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT I | 2018 / Vol. 11164

Funding:

National Natural Science Foundation of China;

Keywords:

Visual dialog; Memory network; Multi-turn attention;

DOI:

10.1007/978-3-030-00776-8_56

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Visual dialog is the task of answering a question given an input image and a historical dialog about the image, and it often requires retrieving visual and textual facts relevant to the question. This problem differs from visual question answering (VQA), which relies only on visual grounding estimated from an image and question pair, whereas the visual dialog task requires interactions among a question, an input image and a historical dialog. Most methods rely on a one-turn attention network to obtain facts w.r.t. a question. However, the information transition phenomenon that exists in these facts prevents these methods from retrieving all relevant information. In this paper, we propose a multi-turn attentional memory network for visual dialog. First, we propose an attentional memory network that maintains image regions and the historical dialog in two memory banks and attends the question to be answered to both the visual and textual banks to obtain multi-modal facts. Further, considering the information transition phenomenon, we design a multi-turn attention architecture that attends to the memory banks over multiple turns to retrieve more facts in order to produce a better answer. We evaluate the proposed model on the VisDial v0.9 dataset, and the experimental results demonstrate the effectiveness of the proposed model.
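
The multi-turn attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the additive query update, the function names (`attend`, `multi_turn_attention`) and the toy two-dimensional vectors are all assumptions made here for illustration, since the abstract does not specify them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Dot-product attention: return the weighted sum of memory slots."""
    scores = [sum(q * x for q, x in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * slot[i] for w, slot in zip(weights, memory))
            for i in range(dim)]


def multi_turn_attention(question, visual_bank, textual_bank, turns=2):
    """Attend to both memory banks for several turns.

    Assumed update rule (hypothetical): after each turn the retrieved
    visual and textual facts are added to the query, so that the next
    turn can retrieve facts related to what was just found.
    """
    query = list(question)
    for _ in range(turns):
        visual_fact = attend(query, visual_bank)
        textual_fact = attend(query, textual_bank)
        query = [q + v + t
                 for q, v, t in zip(query, visual_fact, textual_fact)]
    return query


# Toy memory banks: each slot stands in for an image-region or
# dialog-history embedding (values are made up).
visual_bank = [[1.0, 0.0], [0.0, 1.0]]
textual_bank = [[0.5, 0.5], [1.0, 1.0]]
fused = multi_turn_attention([1.0, 0.0], visual_bank, textual_bank, turns=2)
```

With `turns=1` the sketch reduces to a single attention pass over each bank, which corresponds to the one-turn setting the abstract contrasts against.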


Pages: 611-621

Page count: 11
