Difficulty-Controllable Visual Question Generation

Times Cited: 7
Authors
Chen, Feng [1 ,2 ]
Xie, Jiayuan [1 ,2 ]
Cai, Yi [1 ,2 ]
Wang, Tao [3 ]
Li, Qing [4 ]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China
[2] Key Lab Big Data & Intelligent Robot, Guangzhou, Peoples R China
[3] Hong Kong Polytech Univ, Dept Comp, Hung Hom, Hong Kong, Peoples R China
[4] Kings Coll London, Dept Biostat & Hlth Informat, London, England
Source
WEB AND BIG DATA, APWEB-WAIM 2021, PT I | 2021 / Vol. 12858
Funding
National Natural Science Foundation of China;
Keywords
Difficulty controllable; Visual question generation; Multimodal; ATTENTION; OBJECTS;
DOI
10.1007/978-3-030-85896-4_26
Chinese Library Classification (CLC) Number
TP18 [Theory of Artificial Intelligence];
Discipline Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual Question Generation (VQG) aims to generate questions from images. Existing studies on this topic focus on generating questions solely based on images while neglecting the difficulty of the questions. However, to engage users, an automated question generator should produce questions whose difficulty is tailored to a user's capabilities and experience. In this paper, we propose a Difficulty-controllable Generation Network (DGN) to address this limitation. We borrow the difficulty index from the field of education to define a difficulty variable representing the difficulty of questions, and fuse it into our model to guide difficulty-controllable question generation. Experimental results demonstrate that our proposed model not only achieves significant improvements on several automatic evaluation metrics, but also generates difficulty-controllable questions.
Pages: 332-347
Page count: 16