Learning to Generate Questions by Learning to Recover Answer-containing Sentences

Cited by: 0
Authors
Back, Seohyun [1, 2]
Kedia, Akhil [1]
Chinthakindi, Sai Chetan [1]
Lee, Haejun [1]
Choo, Jaegul [2]
Affiliations
[1] Samsung Res, Seoul, South Korea
[2] Korea Adv Inst Sci & Technol, Daejeon, South Korea
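Funding
National Research Foundation, Singapore
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract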
To train a question answering model based on machine reading comprehension (MRC), significant effort is required to prepare annotated training data composed of questions and their answers from contexts. Recent research has focused on synthetically generating a question from a given context and an annotated (or generated) answer by training an additional generative model to augment the training data. In light of this research direction, we propose a novel pre-training approach that learns to generate contextually rich questions by recovering answer-containing sentences. We evaluate our method against existing ones in terms of the quality of the generated questions and the accuracy of MRC models fine-tuned on data synthetically generated by our method. We consistently improve the question generation capability of existing models such as T5 and UniLM, and achieve state-of-the-art results on MS MARCO and NewsQA, and results comparable to the state of the art on SQuAD. Additionally, the data synthetically generated by our approach boosts downstream MRC accuracy across a wide range of datasets, such as SQuAD-v1.1, SQuAD-v2.0, KorQuAD and BioASQ, without any modification to the existing MRC models. Furthermore, our method is especially effective when only a limited amount of pre-training or downstream MRC data is available.
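The abstract describes the pre-training objective only at a high level: the model learns to recover an answer-containing sentence. A minimal sketch of how one such training example might be built is shown below. It is an illustration under stated assumptions, not the authors' implementation: the regex sentence splitter, the `[MASK_SENT]` placeholder, and the function names `split_sentences` and `make_recovery_example` are all hypothetical.

```python
import re
from typing import Dict, List


def split_sentences(passage: str) -> List[str]:
    """Naive splitter on '.', '!' or '?' followed by whitespace (illustrative assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", passage.strip()) if s]


def make_recovery_example(
    passage: str, answer: str, mask_token: str = "[MASK_SENT]"
) -> Dict[str, str]:
    """Build one hypothetical pre-training example.

    The first sentence containing the answer is replaced by a placeholder in
    the context and becomes the generation target, so a model trained on such
    examples must recover it from the remaining context and the answer.
    """
    sentences = split_sentences(passage)
    for i, sentence in enumerate(sentences):
        if answer in sentence:
            context = " ".join(sentences[:i] + [mask_token] + sentences[i + 1:])
            return {"context": context, "answer": answer, "target": sentence}
    raise ValueError("answer not found in any sentence of the passage")


passage = (
    "Seoul is the capital of South Korea. "
    "It lies on the Han River. "
    "The city hosted the 1988 Olympics."
)
example = make_recovery_example(passage, "Han River")
print(example["context"])
print(example["target"])
```

In this sketch the generator would receive `context` and `answer` as input and be trained to emit `target`; how the published method selects answers and encodes the input is not specified in this record.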
Pages: 1516-1529
Number of pages: 14