Flexible Sentence Analysis Model for Visual Question Answering Network

Cited by: 1

Authors:

Deng, Wei [1]

Wang, Jianming [1]

Wang, Shengbei [1]

Jin, Guanghao [1]

Affiliations:

[1] Tianjin Polytech Univ, Tianjin, Peoples R China

Source:

2018 2ND INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING AND BIOINFORMATICS (ICBEB 2018) | 2018

Funding:

National Natural Science Foundation of China;

Keywords:

Applications for disabilities; visual question answering; flexible sentences analysis; ALZHEIMERS-DISEASE; LANGUAGE;

DOI:

10.1145/3278198.3278207

Chinese Library Classification:

TP39 [Computer applications];

Subject classification codes:

081203 ; 0835 ;

Abstract:

Visual question answering (VQA) has attracted much attention in both computer vision and natural language processing. Generally, a VQA system adopts a sentence analysis model that decomposes the question into short parts to analyze the user's intent and then merges the partial results to obtain the final answer. Despite the success of these models, correctly analyzing long questions remains a key problem in VQA. In particular, when a sentence is comprehended with a deviation because questioners differ in situation or custom, the sentence analysis model may output a wrong answer and cause a severe performance drop in the VQA system. To tackle this problem, this paper proposes a new sentence comprehension model, named the flexible analysis model, which mainly handles sentences related to object counting. In human dialogue, when a first answer turns out to be wrong, people try another way of comprehending the sentence to find the correct answer. Inspired by this mechanism, the flexible sentence analysis model tries a different way of comprehending the sentence after the sentence has been given a wrong numerical answer, and the VQA system generates a new answer from the new output. The model was tested on the CLEVR dataset, and the experimental results show that the method improved accuracy by nearly 10.5% on long sentences. This indicates that the network performs better in both correctness and robustness.

Citation

Pages: 89-95

Page count: 7
