Predicting semantic category of answers for question answering systems using transformers: a transfer learning approach

Cited by: 1
Authors
Suneera, C. M. [1 ]
Prakash, Jay [1 ]
Alaparthi, Varun Sai [1 ]
Affiliations
[1] Natl Inst Technol Calicut, Dept Comp Sci & Engn, Kozhikode 673601, Kerala, India
Keywords
Natural language processing; Question classification; Deep learning; Transfer learning; Transformers; CLASSIFICATION; ATTENTION;
DOI
10.1007/s11042-024-18609-x
CLC classification
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
A question-answering (QA) system is a key application of natural language processing (NLP) that provides relevant answers to user queries posed in natural language. In factoid QA over knowledge bases, predicting the semantic category of the answer, such as a location, person, or numerical value, helps narrow the search space and is an essential step in constructing formal queries for answer retrieval. However, capturing the semantics of sequential data such as questions is challenging. Deep learning methods based on recurrent neural networks have been applied to this task, but they handle long-term dependencies inefficiently. Recently, pre-trained language models built on transformers have proven effective: their attention-based encoders generate context-dependent embeddings for words and sentences. However, training an efficient transformer model for semantic category prediction requires a large dataset and substantial computational resources. In this work, we therefore adopt a transfer learning approach, efficiently adapting pre-trained transformer models to predict the semantic category of answers from input questions. Embeddings from the encoder of the text-to-text transfer transformer (T5) are leveraged to obtain an efficient question representation and to train the classification model, named QcT5. Alongside QcT5, an extensive experimental study of other recent transformer models - BERT, RoBERTa, DeBERTa, and XLNet - is conducted, and their performance is analyzed under various fine-tuning settings. Experimental results indicate that QcT5 significantly outperforms the selected state-of-the-art methods, achieving F1-scores of 98.7% and 89.9% on the TREC-6 and TREC-50 datasets, respectively.
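The pipeline the abstract describes (encode the question with a transformer encoder, collapse the per-token embeddings into one fixed-size question vector, then train a classifier head on that vector) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: masked mean pooling is one common way to summarize encoder outputs, and the function name is hypothetical. In practice the token embeddings would come from a pre-trained encoder such as T5's.

```python
import numpy as np

def masked_mean_pool(token_embeddings: np.ndarray,
                     attention_mask: np.ndarray) -> np.ndarray:
    """Collapse per-token encoder outputs of shape (seq_len, hidden)
    into a single (hidden,) question vector, ignoring padding tokens.

    Hypothetical helper illustrating the pooling step; real token
    embeddings would come from a pre-trained encoder (e.g. T5's).
    """
    mask = attention_mask[:, None].astype(float)      # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)    # sum over real tokens
    count = mask.sum()                                # number of real tokens
    return summed / np.maximum(count, 1.0)            # guard against all-pad input

# Toy example: 3 tokens (the last is padding), hidden size 2.
emb = np.array([[1.0, 2.0],
                [3.0, 4.0],
                [9.0, 9.0]])   # padding row, excluded by the mask
mask = np.array([1, 1, 0])
vec = masked_mean_pool(emb, mask)   # -> [2.0, 3.0]
```

The resulting fixed-size vector would then be fed to a classification head (e.g. a softmax layer over the TREC category labels), which is the part trained during fine-tuning.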
Pages: 77393-77413
Page count: 21